
Machine Learning

machine_learning
Machine Learning ericjmorey 3 weeks ago 100%
Learn PyTorch for Deep Learning: Zero to Mastery | Free Online Book | Daniel Bourke www.learnpytorch.io

> ### About this course
>
> **Who is this course for?**
>
> You: Are a beginner in the field of machine learning or deep learning or AI and would like to learn PyTorch.
>
> This course: Teaches you PyTorch and many machine learning, deep learning and AI concepts in a hands-on, code-first way.
>
> If you already have 1+ years of experience in machine learning, this course may help but it is specifically designed to be beginner-friendly.
>
> **What are the prerequisites?**
>
> - 3-6 months coding Python.
> - At least one beginner machine learning course (though this might be able to be skipped; resources are linked for many different topics).
> - Experience using Jupyter Notebooks or Google Colab (though you can pick this up as we go along).
> - A willingness to learn (most important).
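
As a taste of the course's code-first style, here is a minimal PyTorch sketch (an illustration, not code from the book) that builds a one-parameter linear model and runs a single gradient step:

```python
import torch
from torch import nn

# A tiny linear regression model: y = w*x + b
model = nn.Linear(in_features=1, out_features=1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Synthetic data: y = 2x plus a little noise
x = torch.randn(32, 1)
y = 2 * x + 0.1 * torch.randn(32, 1)

# One training step: forward pass, loss, backward pass, parameter update
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(f"loss after one step: {loss.item():.4f}")
```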

7
0
machine_learning
Machine Learning ericjmorey 1 month ago 100%
Data Science Handbook | Curated resources (Free & Paid) to help data scientists learn, grow, and break into the field of data science | Andres Vourakis | Last update Jul 23, 2024 github.com

Andres Vourakis writes:

> ### Data Scientist Handbook 2024
>
> Curated resources (Free & Paid) to help data scientists learn, grow, and break into the field of data science.
>
> Even though there are hundreds of resources out there (too many to keep track of), I will try to limit them to a maximum of 5 per category to ensure you get the most valuable and relevant resources. Plus, the whole point of this repository is to help you avoid getting overwhelmed by too many choices. This way you can focus less time researching and more time learning.
>
> ### FAQs
>
> - **How is curation done?** Curation is based on thorough research, recommendations from people I trust, and my years of experience as a Data Scientist.
> - **Are all resources free?** Most resources here will be free, but I will also include paid alternatives if they are truly valuable to your career development. All paid resources include the symbol 💲.
> - **How often is the repository updated?** I plan to come back here as often as possible to ensure all resources are still available and relevant and also to add new ones.

16
0
machine_learning
Machine Learning ericjmorey 1 month ago 100%
Elements of Data Science | Allen B. Downey | July 17, 2024 www.allendowney.com

July 17, 2024 | [Allen B. Downey](https://www.allendowney.com/blog/2024/07/17/elements-of-data-science/) writes:

> Elements of Data Science is an introduction to data science for people with no programming experience. My goal is to present a small, powerful subset of Python that allows you to do real work with data as quickly as possible.
>
> Part 1 includes six chapters that introduce basic Python with a focus on working with data.
>
> Part 2 presents exploratory data analysis using Pandas and empiricaldist — it includes a revised and updated version of the material from my popular DataCamp course, "Exploratory Data Analysis in Python."
>
> Part 3 takes a computational approach to statistical inference, introducing resampling methods, bootstrapping, and randomization tests.
>
> Part 4 is the first of two case studies. It uses data from the General Social Survey to explore changes in political beliefs and attitudes in the U.S. in the last 50 years. The data points on the cover are from one of the graphs in this section.
>
> Part 5 is the second case study, which introduces classification algorithms and the metrics used to evaluate them — and discusses the challenges of algorithmic decision-making in the context of criminal justice.
>
> This project started in 2019, when I collaborated with a group at Harvard to create a data science class for people with no programming experience. We discussed some of the design decisions that went into the course and the book in this article.

Read [Elements of Data Science](https://allendowney.github.io/ElementsOfDataScience/) in the form of Jupyter notebooks.
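
The resampling approach in Part 3 fits in a few lines of plain Python; here is a minimal hedged sketch (not from the book) of bootstrapping a confidence interval for a mean with NumPy:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=10, scale=2, size=100)  # stand-in sample

# Bootstrap: resample with replacement, recompute the statistic each time
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(10_000)
])

# The middle 95% of the bootstrap means approximates a confidence interval
low, high = np.percentile(boot_means, [2.5, 97.5])
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```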

3
0
machine_learning
Machine Learning ericjmorey 3 months ago 84%
Let's reproduce GPT-2 (124M) | Andrej Karpathy | Jun 9, 2024 https://www.youtube.com/watch?v=l8pRSuU81PU&ab_channel=AndrejKarpathy

Video description:

> We reproduce the GPT-2 (124M) from scratch.
>
> This video covers the whole process: First we build the GPT-2 network, then we optimize its training to be really fast, then we set up the training run following the GPT-2 and GPT-3 paper and their hyperparameters, then we hit run, and come back the next morning to see our results, and enjoy some amusing model generations.
>
> Keep in mind that in some places this video builds on the knowledge from earlier videos in the Zero to Hero Playlist (see my channel). You could also see this video as building my nanoGPT repo, which by the end is about 90% similar.
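
For reference while following along, the pretrained 124M checkpoint being reproduced is available through the Hugging Face `transformers` library; a small sketch (assuming `transformers` and `torch` are installed):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# "gpt2" is the 124M-parameter checkpoint the video reproduces
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Hello, I'm a language model,"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0]))
```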

9
1
machine_learning
Machine Learning ericjmorey 4 months ago 100%
A Few Useful Things to Know About Machine Learning | Tapping into the "folk knowledge" needed to advance machine learning | Pedro Domingos | Communications of the ACM | vol. 55 no. 10 | October 2012 https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/

[Pedro Domingos](https://scholar.google.com/citations?hl=en&user=KOrhfVMAAAAJ) summarizes 12 key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions.

1. [Learning = Representation + Evaluation + Optimization](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-3)
2. [It's Generalization that Counts](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-4)
3. [Data Alone Is Not Enough](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-5)
4. [Overfitting Has Many Faces](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-6)
5. [Intuition Fails in High Dimensions](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-7)
6. [Theoretical Guarantees Are Not What They Seem](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-8)
7. [Feature Engineering Is The Key](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-9)
8. [More Data Beats a Cleverer Algorithm](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-10)
9. [Learn Many Models, Not Just One](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-11)
10. [Simplicity Does Not Imply Accuracy](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-12)
11. [Representable Does Not Imply Learnable](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-13)
12. [Correlation Does Not Imply Causation](https://cacm.acm.org/research/a-few-useful-things-to-know-about-machine-learning/#body-14)
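
Lessons 2 ("It's Generalization that Counts") and 4 ("Overfitting Has Many Faces") are easy to see empirically; a hedged scikit-learn sketch comparing train and test accuracy for an unconstrained versus a depth-limited tree:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unconstrained tree memorizes the training set...
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# ...while a depth-limited tree trades training fit for generalization
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

for name, clf in [("deep", deep), ("shallow", shallow)]:
    print(name, "train:", clf.score(X_tr, y_tr), "test:", clf.score(X_te, y_te))
```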

3
0
machine_learning
Machine Learning ericjmorey 5 months ago 100%
Multicollinearity in Logistic Regression Models | Anesthesia & Analgesia https://journals.lww.com/anesthesia-analgesia/fulltext/2021/08000/multicollinearity_in_logistic_regression_models.12.aspx

Bayman, Emine Ozgur, PhD; Dexter, Franklin, MD, PhD, FASA. Multicollinearity in Logistic Regression Models. Anesthesia & Analgesia 133(2): 362-365, August 2021. DOI: 10.1213/ANE.0000000000005593
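
A standard way to screen predictors for the multicollinearity the paper discusses is the variance inflation factor (VIF); a minimal sketch with statsmodels (illustrative, not the paper's code):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# VIF > 10 is a common (if debated) rule of thumb for problematic collinearity
for i, col in enumerate(X.columns[1:], start=1):
    print(col, variance_inflation_factor(X.values, i))
```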

4
0
machine_learning
Machine Learning Akisamb 5 months ago 100%
[PLDI'23] Scallop: A Language for Neurosymbolic Programming https://www.youtube.com/watch?v=jkgnQ-BoTDM

cross-posted from: https://lemmy.one/post/13942290

> Abstract: We present Scallop, a language which combines the benefits of deep learning and logical reasoning. Scallop enables users to write a wide range of neurosymbolic applications and train them in a data- and compute-efficient manner. It achieves these goals through three key features: 1) a flexible symbolic representation that is based on the relational data model; 2) a declarative logic programming language that is based on Datalog and supports recursion, aggregation, and negation; and 3) a framework for automatic and efficient differentiable reasoning that is based on the theory of provenance semirings. We evaluate Scallop on a suite of eight neurosymbolic applications from the literature. Our evaluation demonstrates that Scallop is capable of expressing algorithmic reasoning in diverse and challenging AI tasks, provides a succinct interface for machine learning programmers to integrate logical domain knowledge, and yields solutions that are comparable or superior to state-of-the-art models in terms of accuracy. Furthermore, Scallop's solutions outperform these models in aspects such as runtime and data efficiency, interpretability, and generalizability.
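
To get a feel for what the Datalog-style recursion at Scallop's core computes, here is a plain-Python sketch (not Scallop syntax) of evaluating a recursive `path` relation over an `edge` relation to a fixed point:

```python
# Datalog-style rules, evaluated naively to a fixed point:
#   path(a, b) :- edge(a, b).
#   path(a, c) :- path(a, b), edge(b, c).
edge = {(0, 1), (1, 2), (2, 3)}

path = set(edge)
while True:
    new = {(a, c) for (a, b) in path for (b2, c) in edge if b == b2}
    if new <= path:
        break  # fixed point reached: no new facts are derivable
    path |= new

print(sorted(path))  # includes derived facts such as (0, 3)
```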

6
0
machine_learning
Machine Learning ericjmorey 5 months ago 100%
An exploration of multicollinearity | Google Colab colab.research.google.com

[Original post](https://old.reddit.com/r/learnmachinelearning/comments/1chiiw6/the_full_story_behind_multicollinearity/) on r/learnmachinelearning

3
0
machine_learning
Machine Learning ericjmorey 5 months ago 87%
Activation function and GLU variants for Transformer models | Tarique Anwar | Apr 18, 2022 https://medium.com/@tariqanwarph/activation-function-and-glu-variants-for-transformer-models-a4fcbe85323f

Apr 18, 2022 | Tarique Anwar writes:

> The main reason for ReLU being used is that it is simple, fast, and empirically it seems to work well.
>
> But with the emergence of Transformer based models, different variants of activation functions and GLU have been experimented with and do seem to perform better. Some of them are:
>
> - GeLU²
> - Swish¹
> - GLU³
> - GEGLU⁴
> - SwiGLU⁴
>
> We will go over some of these in detail, but before that let's see where exactly these activations are utilized in a Transformer architecture.

Read [Activation function and GLU variants for Transformer models](https://medium.com/@tariqanwarph/activation-function-and-glu-variants-for-transformer-models-a4fcbe85323f)
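
As a concrete reference, here is a hedged PyTorch sketch of the GEGLU and SwiGLU feed-forward variants, following the standard formulation act(xW) ⊙ (xV) from Shazeer's "GLU Variants Improve Transformer" (not code from the article):

```python
import torch
from torch import nn
import torch.nn.functional as F

class GLUFeedForward(nn.Module):
    """Transformer FFN with a GLU variant: act(x W) * (x V), then project."""
    def __init__(self, d_model: int, d_ff: int, activation=F.silu):
        super().__init__()
        self.w = nn.Linear(d_model, d_ff, bias=False)   # gated branch
        self.v = nn.Linear(d_model, d_ff, bias=False)   # linear branch
        self.w2 = nn.Linear(d_ff, d_model, bias=False)  # output projection
        self.activation = activation  # F.silu -> SwiGLU, F.gelu -> GEGLU

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(self.activation(self.w(x)) * self.v(x))

ffn = GLUFeedForward(d_model=8, d_ff=32)  # SwiGLU by default
out = ffn(torch.randn(2, 5, 8))           # (batch, seq, d_model)
print(out.shape)
```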

6
0
machine_learning
Machine Learning ericjmorey 6 months ago 100%
Navigating Neural Networks: Exploring State-of-the-Art Activation Functions – OMSCS 7641: Machine Learning https://sites.gatech.edu/omscs7641/2024/01/31/navigating-neural-networks-exploring-state-of-the-art-activation-functions/

> ### Summary
>
> Activation functions are crucial in neural networks, introducing non-linearity and enabling the modeling of complex patterns across varied tasks. This guide delves into the evolution, characteristics, and applications of state-of-the-art activation functions, illustrating their role in enhancing neural network performance. It discusses the transition from classic functions like sigmoid and tanh to advanced ones such as ReLU and its variants, addressing challenges like the vanishing gradient problem and the dying ReLU issue. Concluding with practical heuristics for selecting activation functions, the article emphasizes the importance of considering network architecture and task specifics, highlighting the rich diversity of activation functions available for optimizing neural network designs.
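
The vanishing-gradient problem the guide describes is easy to demonstrate numerically; a small hedged PyTorch sketch comparing gradient magnitudes through stacked sigmoid versus ReLU layers:

```python
import torch

def grad_through(act, depth=20):
    """Gradient of the input after passing through `depth` stacked activations."""
    x = torch.ones(1, requires_grad=True)
    y = x
    for _ in range(depth):
        y = act(y)
    y.backward()
    return x.grad.item()

# Sigmoid's derivative is at most 0.25, so gradients shrink geometrically
# with depth; ReLU passes gradient 1 along its active (positive) path.
print("sigmoid:", grad_through(torch.sigmoid))  # vanishingly small
print("relu:   ", grad_through(torch.relu))     # stays 1.0
```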

6
3
machine_learning
Machine Learning ericjmorey 6 months ago 100%
Python Data Science Day will be taking place March 14th, 2024; a “PyDay” on Pi Day devblogs.microsoft.com

[Dawn Wages](https://devblogs.microsoft.com/python/author/dawnwages/) writes:

> Python Data Science Day is a full day of 25 min and 5 min community contributed content March 14th, 2024, streaming on the [VS Code YouTube channel](https://www.youtube.com/@code).

5
0
machine_learning
Machine Learning ericjmorey 6 months ago 80%
Python Data Science Cloud Skills Challenge | Microsoft Learn learn.microsoft.com

> Start 2024 with a new goal: become an expert with Python in the cloud. Join us this quarter as we challenge ourselves with Python, Machine Learning and Data Science.
>
> 7 hr 1 min | 10 Modules

3
0
machine_learning
Machine Learning Akisamb 6 months ago 100%
Midjourney bans all Stability AI employees over alleged data scraping www.theverge.com

cross-posted from: https://lemmy.ml/post/13088944

14
2
machine_learning
Machine Learning ericjmorey 7 months ago 100%
Image denoising techniques: A comparison of PCA, kernel PCA, autoencoder, and CNN www.fabriziomusacchio.com

June 21, 2023 | [Fabrizio Musacchio](https://www.fabriziomusacchio.com/) writes:

> In this post, we will explore the potential of PCA [Principal Component Analysis], denoising autoencoders and Convolutional Neural Networks (CNN) for restoring noisy images using Python. We will examine their performance, advantages, and disadvantages to determine the most effective method for image denoising.
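
The PCA variant of the idea fits in a few lines of scikit-learn; a hedged sketch (not the post's code) that denoises the digits dataset by keeping only the leading components:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data
rng = np.random.default_rng(0)
X_noisy = X + rng.normal(scale=4.0, size=X.shape)

# Keep enough components to explain ~80% of the variance; the discarded
# low-variance directions carry most of the noise.
pca = PCA(n_components=0.80).fit(X_noisy)
X_denoised = pca.inverse_transform(pca.transform(X_noisy))
print(X_denoised.shape, "components kept:", pca.n_components_)
```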

8
0
machine_learning
Machine Learning Kissaki 7 months ago 100%
Opus 1.5 Released - achieves audible, non-stuttering talk at 90% packet loss https://opus-codec.org/demo/opus-1.5/

cross-posted from: https://programming.dev/post/11034601

There's a lot here, and specifically a lot of machine learning talk and features, in the 1.5 release of Opus, the free and open audio codec. Audible and continuous (albeit jittery) speech at 90% packet loss is crazy.

The *WebRTC Integration* → *Samples* section has an example where you can test out the 90% packet loss audio.

10
1
machine_learning
Machine Learning ericjmorey 7 months ago 100%
Where Is Noether's Principle in Machine Learning? | 2024-02-29 https://cgad.ski/blog/where-is-noethers-principle-in-machine-learning.html

2024-02-29 | [Christopher Gadzinski](https://cgad.ski/) writes:

> Physics likes optimization! Subject to its boundary conditions, the time evolution of a physical system is a critical point for a quantity called an action. This point of view sets the stage for Noether's principle, a remarkable correspondence between continuous invariances of the action and conservation laws of the system.
>
> In machine learning, we often deal with discrete "processes" whose control parameters are chosen to minimize some quantity. For example, we can see a deep residual network as a process where the role of "time" is played by depth. We may ask:
>
> 1. Does Noether's theorem apply to these processes?
> 2. Can we find meaningful conserved quantities?
>
> Our answers: "yes," and "not sure!"
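
For readers who want the classical statement the post leans on, here is the textbook formulation in standard notation (not taken from the post):

```latex
% Action of a path q(t) under Lagrangian L:
S[q] = \int_{t_0}^{t_1} L(q, \dot{q}, t)\,dt
% Physical trajectories are critical points of S (Euler-Lagrange equations):
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0
% Noether: if the action is invariant under the continuous shift
% q \mapsto q + \epsilon\,\delta q, then along solutions the quantity
Q = \frac{\partial L}{\partial \dot{q}} \cdot \delta q
% is conserved: \frac{dQ}{dt} = 0.
```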

6
1
machine_learning
Machine Learning ericjmorey 7 months ago 83%
Why Is the Current XAI Not Meeting the Expectations? – Communications of the ACM https://cacm.acm.org/opinion/why-is-the-current-xai-not-meeting-the-expectations/

XAI = Explainable Artificial Intelligence

Dec 14, 2023 | [Alessio Malizia](https://cacm.acm.org/author/alessio-malizia/) and [Fabio Paternò](https://cacm.acm.org/author/fabio-paterno/) write:

> Numerous papers argue for using XAI methods in the literature, as well as multiple suggestions for brand-new XAI family approaches. Nevertheless, finding instances of practical XAI technique implementations that have enhanced the business in industry/societal/real-world applications is more challenging, even if some interesting work in this area has been put forward, for example in the health domain.

Read [Why Is the Current XAI Not Meeting the Expectations?](https://cacm.acm.org/opinion/why-is-the-current-xai-not-meeting-the-expectations/)

4
0
machine_learning
Machine Learning ericjmorey 8 months ago 85%
A One-Stop Shop for Principal Component Analysis https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c

Apr 17, 2017 | [Matt Brems](https://matthew-brems.medium.com/) writes:

> Principal component analysis (PCA) is an important technique to understand in the fields of statistics and data science… but when putting a lesson together for my General Assembly students, I found that the resources online were too technical, didn't fully address our needs, and/or provided conflicting information. It's safe to say that I'm not "entirely satisfied with the available texts" here.
>
> As a result, I wanted to put together the "What," "When," "How," and "Why" of PCA as well as links to some of the resources that can help to further explain this topic. Specifically, I want to present the rationale for this method, the math under the hood, some best practices, and potential drawbacks to the method.
>
> While I want to make PCA as accessible as possible, the algorithm we'll cover is pretty technical. Being familiar with some or all of the following will make this article and PCA as a method easier to understand: matrix operations/linear algebra (matrix multiplication, matrix transposition, matrix inverses, matrix decomposition, eigenvectors/eigenvalues) and statistics/machine learning (standardization, variance, covariance, independence, linear regression, feature selection). I've embedded links to illustrations of these topics throughout the article, but hopefully these will serve as a reminder rather than required reading to get through the article.

Read [A One-Stop Shop for Principal Component Analysis](https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c)
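
The "math under the hood" the article walks through (standardize, covariance matrix, eigendecomposition, project) can be written out in a few lines of NumPy; a hedged sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features

# 1. Standardize each feature to zero mean, unit variance
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized data
C = np.cov(Z, rowvar=False)

# 3. Eigendecomposition; eigenvectors are the principal components
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]  # sort by explained variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 4. Project onto the top two components
scores = Z @ eigvecs[:, :2]
print("explained variance ratio:", eigvals[:2] / eigvals.sum())
```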

5
0
machine_learning
Machine Learning ericjmorey 8 months ago 100%
Spark vs Presto: A Comprehensive Comparison https://www.analyticsvidhya.com/blog/2023/12/spark-vs-presto-a-comprehensive-comparison/

cross-posted from: https://programming.dev/post/9436800

> December 28, 2023 | [Pankaj Singh](https://www.analyticsvidhya.com/blog/author/pankaj9786/) writes:
>
> > In big data processing and analytics, choosing the right tool is paramount for efficiently extracting meaningful insights from vast datasets. Two popular frameworks that have gained significant traction in the industry are Apache Spark and Presto. Both are designed to handle large-scale data processing efficiently, yet they have distinct features and use cases. As organizations grapple with the complexities of handling massive volumes of data, a comprehensive understanding of Spark and Presto's nuances and distinctive features becomes essential. In this article, we will compare Spark vs Presto, exploring their performance and scalability, data processing capabilities, ecosystem, integration, and use cases and applications.
>
> Read [Spark vs Presto: A Comprehensive Comparison](https://www.analyticsvidhya.com/blog/2023/12/spark-vs-presto-a-comprehensive-comparison/)
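
For orientation, here is what a simple aggregation looks like on the Spark side of the comparison; a minimal hedged PySpark sketch (Presto would express the same logic as plain SQL):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("demo").getOrCreate()

df = spark.createDataFrame(
    [("a", 1), ("a", 3), ("b", 2)],
    ["key", "value"],
)

# Presto equivalent: SELECT key, SUM(value) AS total FROM t GROUP BY key
df.groupBy("key").agg(F.sum("value").alias("total")).show()
```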

2
0
machine_learning
Machine Learning ericjmorey 8 months ago 100%
Offline listening and speaking bot github.com

cross-posted from: https://lemmy.world/post/11196216

> Hi all,
>
> For those wanting a quick repo to use as a basis to get started, I've created jen-ai.
>
> There are full instructions in the readme. Once running you can talk to it, and it will respond.
>
> It's basic, but a place to start.

7
1
machine_learning
Machine Learning ericjmorey 8 months ago 100%
My Target Audience | The Two Culture Problem scientificcoder.com

cross-posted from: https://programming.dev/post/9428234

> Apr 18, 2023 | [Matthijs Cox](https://hashnode.com/@matthijscox) writes:
>
> > The "two language problem" was globally accepted for several decades. You just learn to live with it. Until one day it was challenged by the Julia language, a programming language that promises both speed and ease of use. As I feel the pain of the two language problem deeply, I wanted to try out this new solution. So together with several allies I went on a mission to adopt this new technology at work, and remove the bottleneck.
> >
> > While we had initial success and attention, we quickly stumbled into resistance from the existing groups of researchers/scientists and developers. Over time I have named this the "two culture problem". In the beginning I didn't see the cultures clearly, which limited our success. I was too focused on the technological problem itself.
> >
> > I will refer to the two cultures as "scientists" versus "developers". However, the "scientists" group generalizes to anyone who codes quick and dirty to explore, such as domain experts, data analysts and others like that. I do hope everyone is doing their exploration somewhat scientifically, so the generalization should make sense. Scientists typically want to get their stuff done, perhaps with code, but they don't care about the code. Software developers care deeply about code craftsmanship, sometimes obsessively so, but often developers barely understand the business domain or science. There are people near the middle, trying to balance both, but they are a rare breed.
>
> Read [My Target Audience](https://scientificcoder.com/my-target-audience)

4
0
machine_learning
Machine Learning Chruesimuesi 8 months ago 100%
The Math behind Adam Optimizer | Towards Data Science towardsdatascience.com

The article discusses the Adam optimizer, a popular algorithm in deep learning known for its efficiency in adjusting learning rates for different parameters. Unlike other optimizers like SGD or Adagrad, Adam dynamically changes its step size based on the complexity of the problem, analogous to adjusting stride in varying terrains. This ability to adapt makes it effective in quickly finding the minimum loss in machine learning tasks, a key reason for its popularity in winning Kaggle competitions and among those seeking a deeper understanding of optimizer mechanics.
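
The update rule behind that adaptivity is compact enough to sketch directly; a hedged NumPy version of Adam's moment estimates and bias correction (the standard formulation, not the article's code):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m and v are running first/second moment estimates."""
    m = b1 * m + (1 - b1) * grad        # momentum-like mean of gradients
    v = b2 * v + (1 - b2) * grad**2     # running mean of squared gradients
    m_hat = m / (1 - b1**t)             # bias correction (t starts at 1)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # adaptive step
    return theta, m, v

# Minimize f(x) = x^2, whose gradient is 2x, starting from x = 5
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
print(theta)  # close to 0
```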

14
0
machine_learning
Machine Learning ericjmorey 8 months ago 94%
Code Llama 70B: a new, more performant version of Meta's LLM for code generation is now available. https://twitter.com/AIatMeta/status/1752013879532782075

[AI at Meta writes](https://twitter.com/AIatMeta/status/1752013879532782075):

> Today we're releasing Code Llama 70B: a new, more performant version of our LLM for code generation — available under the same license as previous Code Llama models.
>
> [Download the models](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
>
> - CodeLlama-70B
> - CodeLlama-70B-Python
> - CodeLlama-70B-Instruct
>
> CodeLlama-70B-Instruct achieves 67.8 on HumanEval, making it one of the highest performing open models available today.
>
> CodeLlama-70B is the most performant base for fine-tuning code generation models and we're excited for the community to build on this work.

[Nitter link](https://nitter.net/AIatMeta/status/1752013879532782075)

[Try the model on Hugging Face](https://huggingface.co/codellama)

[Installable using Ollama](https://github.com/ollama/ollama)
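
A minimal sketch of loading the instruct variant through `transformers`; the `codellama/CodeLlama-70b-Instruct-hf` repo id is my assumption of the Hugging Face name (check the hub link above), and the 70B weights need substantial GPU memory, so treat this as illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id on the Hugging Face hub; verify at https://huggingface.co/codellama
model_id = "codellama/CodeLlama-70b-Instruct-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```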

15
1
machine_learning
Machine Learning ericjmorey 8 months ago 100%
Real World Data Science Competition | When will the cherry trees bloom? realworlddatascience.net

[Real World Data Science](https://realworlddatascience.net/about-rwds.html) writes:

> The 2024 International Cherry Blossom Prediction Competition will open for entries on February 1, and Real World Data Science is once again proud to be a sponsor.
>
> Contestants are invited to submit predictions for the date cherry trees will bloom in 2024 at five different locations: Kyoto, Japan; Liestal-Weideli, Switzerland; Vancouver, Canada; and Washington, DC and New York City, USA.

Read details about [Real World Data Science Competition | When will the cherry trees bloom?](https://realworlddatascience.net/viewpoints/editors-blog/posts/2024/01/18/cherry-blossom.html)

19
1
machine_learning
Machine Learning ericjmorey 8 months ago 100%
NumPy 2 is coming: preventing breakage, updating your code pythonspeed.com

cross-posted from: https://programming.dev/post/8724281

> [Itamar Turner-Trauring](https://pythonspeed.com/about/) writes:
>
> > These sorts of problems are one of the many reasons you want to "pin" your application's dependencies: make sure you only install a specific, fixed set of dependencies. Without reproducible dependencies, as soon as NumPy 2 comes out your application might break when it gets installed with new dependencies.
> >
> > The really short version is that you have two sets of dependency configurations:
> >
> > - **A direct dependency list**: A list of libraries you directly import in your code, loosely restricted. This is the list of dependencies you put in pyproject.toml or setup.py.
> > - **A lock file**: A list of all dependencies you rely on, direct or indirect (dependencies of dependencies), pinned to specific versions. This might be a requirements.txt, or some other file depending on which tool you're using.
> >
> > [At appropriate intervals you update the lock file](https://pythonspeed.com/articles/when-update-dependencies/) based on the direct dependency list.
> >
> > I've written multiple articles on the topic, in case you're not familiar with the relevant tools:
> >
> > - "[Faster Docker builds with pipenv, poetry, or pip-tools](https://pythonspeed.com/articles/pipenv-docker/)" covers using those three tools to maintain lockfiles.
> > - For Conda, see "[Reproducible and upgradable Conda environments with conda-lock](https://pythonspeed.com/articles/activate-conda-dockerfile/)".
>
> Read [NumPy 2 is coming: preventing breakage, updating your code](https://pythonspeed.com/articles/numpy-2/)
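
As a concrete sketch of that two-file setup with pip-tools (file names and the `<2` bound are illustrative assumptions, not from the article):

```
# requirements.in — the direct dependency list, loosely restricted
numpy>=1.24,<2

# Compile the lock file from it:
#   pip-compile requirements.in
# This produces requirements.txt with every transitive dependency
# pinned to an exact version, e.g.:
#   numpy==1.26.4
```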

10
0
machine_learning
Machine Learning ericjmorey 8 months ago 100%
The Random Transformer | Understand how transformers work by demystifying all the math behind them https://osanseviero.github.io/hackerllama/blog/posts/random_transformer/

January 1, 2024 | [Omar Sanseviero](https://osanseviero.github.io/hackerllama/) writes:

> In this blog post, we'll do an end-to-end example of the math within a transformer model. The goal is to get a good understanding of how the model works. To make this manageable, we'll do lots of simplification. As we'll be doing quite a bit of the math by hand, we'll reduce the dimensions of the model. For example, rather than using embeddings of 512 values, we'll use embeddings of 4 values. This will make the math easier to follow! We'll use random vectors and matrices, but you can use your own values if you want to follow along.

Read [The Random Transformer | Understand how transformers work by demystifying all the math behind them](https://osanseviero.github.io/hackerllama/blog/posts/random_transformer/)
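
In the same spirit, here is the core attention computation at the post's toy scale (4-dimensional embeddings, random weights; a hedged sketch, not the post's numbers):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                        # toy embedding size, as in the post
X = rng.normal(size=(3, d))  # three token embeddings

# Random projection matrices for queries, keys, and values
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
out = weights @ V
print(out.shape)  # (3, 4): one updated vector per token
```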

6
0
machine_learning
Machine Learning ericjmorey 8 months ago 77%
GPU-based ODE solvers which are 20x-100x faster than those in #jax and #pytorch

cross-posted from: https://programming.dev/post/8391233

> [Dr. Chris Rackauckas (@chrisrackauckas@fosstodon.org) writes](https://fosstodon.org/@chrisrackauckas/111631212096602233):
>
> > #julialang GPU-based ODE solvers which are 20x-100x faster than those in #jax and #pytorch? Check out the paper on how #sciml DiffEqGPU.jl works. Instead of relying on high level array intrinsics that #machinelearning libraries use, it uses a direct kernel generation approach to greatly reduce the overhead.
>
> Read [Automated translation and accelerated solving of differential equations on multiple GPU platforms](https://www.sciencedirect.com/science/article/abs/pii/S0045782523007156)
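
For context on the baseline being compared against, the high-level JAX ODE interface looks like this; a hedged sketch using `jax.experimental.ode.odeint`, the array-intrinsic style the post contrasts with direct kernel generation:

```python
import jax.numpy as jnp
from jax.experimental.ode import odeint

# Exponential decay: dy/dt = -y, with y(0) = 1
def f(y, t):
    return -y

ts = jnp.linspace(0.0, 5.0, 50)
ys = odeint(f, jnp.array(1.0), ts)
print(ys[-1])  # approximately exp(-5) ≈ 0.0067
```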

5
1