Machine Learning: PyTorch 1.12 with TorchArrow library and nvFuser

Just over a quarter of a year after the first release of 2022, the development team of the machine learning framework PyTorch has announced another new version, 1.12. A total of 433 developers contributed 3,124 commits to realize a whole range of innovations and improvements in this version. The release again includes several beta features. For example, TorchArrow, a library for preprocessing batch data for machine learning, and an AWS S3 integration are available as betas.

According to the description in the blog post, the new TorchArrow library offers programmers a high-performance, Pandas-like, easy-to-use API for accelerating their preprocessing workflows and development. To this end, the feature provides a Python DataFrame interface with the following characteristics:

  • a high-performance CPU backend with vectorized, extensible user-defined functions (UDFs) built on Velox,
  • seamless handoff to PyTorch and other model-authoring workflows, such as tensor collation and easy integration with PyTorch DataLoader and DataPipes, as well as
  • zero-copy interoperability with external readers via the Arrow in-memory columnar format.
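Since TorchArrow may not be installed everywhere, the following sketch uses plain Pandas to illustrate the kind of Pandas-like, UDF-driven preprocessing flow the blog post describes; the column names and functions here are purely illustrative and not part of the TorchArrow API:

```python
import pandas as pd

# An illustrative batch of raw training examples (column names made up for this sketch).
batch = pd.DataFrame({
    "text": ["Good movie", "bad plot", "GREAT cast"],
    "rating": [5, 2, 4],
})

# A user-defined function (UDF) applied column-wise, analogous to a
# TorchArrow-style preprocessing step.
def normalize(s: str) -> str:
    return s.lower().strip()

batch["text"] = batch["text"].map(normalize)
# Scale ratings to [0, 1] before handing the batch to a model.
batch["rating"] = batch["rating"] / 5.0

print(batch["text"].tolist())    # ['good movie', 'bad plot', 'great cast']
print(batch["rating"].tolist())  # [1.0, 0.4, 0.8]
```

In TorchArrow itself, the same shape of pipeline runs on the vectorized Velox CPU backend and hands the result over to PyTorch DataLoader or DataPipes without extra copies.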

Those who want to try the library themselves can start with a 10-minute tutorial, the installation instructions, or the API documentation. The development team also offers a prototype on GitHub that demonstrates a TorchRec-based training loop using TorchArrow's on-the-fly preprocessing.

As in the previous release, some new functionality is still in beta. This includes a functional API for modules: PyTorch 1.12 introduces a way to apply a module's computation functionally with a given set of parameters. The development team's rationale is that the traditional PyTorch usage pattern, in which a module internally maintains a static set of parameters, is sometimes too restrictive, especially when implementing meta-learning algorithms that must maintain multiple sets of parameters across optimization steps.

The API torch.nn.utils.stateless.functional_call() allows module computation with full flexibility over the set of parameters used, without requiring a functional re-implementation of the module. Any parameter or buffer present in the module can be replaced with an externally defined value for the duration of the call.
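The idea behind such a stateless call can be sketched in plain Python, without PyTorch: temporarily swap a module's stored parameters for externally supplied values, run the forward pass, then restore the originals. The class and helper below are illustrative, not the actual PyTorch implementation:

```python
class Linear:
    """A toy 'module' holding one scalar weight and bias as internal state."""
    def __init__(self, w, b):
        self.w = w
        self.b = b

    def forward(self, x):
        return self.w * x + self.b

def functional_call(module, overrides, x):
    """Run module.forward(x) with parameters taken from `overrides`,
    leaving the module's own state untouched afterwards."""
    saved = {name: getattr(module, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(module, name, value)
        return module.forward(x)
    finally:
        # Restore the original parameters even if forward() raises.
        for name, value in saved.items():
            setattr(module, name, value)

m = Linear(w=2.0, b=1.0)
print(m.forward(3.0))                                   # 7.0  (stored parameters)
print(functional_call(m, {"w": 10.0, "b": 0.0}, 3.0))   # 30.0 (external parameters)
print(m.forward(3.0))                                   # 7.0  (state restored)
```

This is exactly the flexibility meta-learning algorithms need: the same module can be evaluated under several candidate parameter sets without rebuilding it or mutating its long-lived state.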

With release 1.12, the PyTorch development team has also replaced the default fuser for Volta and later CUDA accelerators: it is now nvFuser, which supports a wider range of operations and is faster than NNC, the previous fuser for CUDA devices. The developers promise to explain nvFuser in more detail in an upcoming blog post and to show how the compiler can be used to speed up training for a large number of networks. Interested parties can already find further details on usage and troubleshooting in the nvFuser documentation.

