by Ian Ozsvald
Teaser for full tutorial: http://lanyrd.com/2011/europytho...
by Stefano Brilli
CUDA technology makes it possible to exploit the power of modern NVIDIA GPUs. In this talk, after a brief introduction to GPU architecture, we will focus on how CUDA became available in Python through libraries such as PyCUDA and others…
Through a few examples, we will show the main concepts and techniques of good GPU programming.
This talk targets anyone who wants to know how to exploit this technology from Python: the suitable use cases, the techniques to use, and the techniques to avoid in order to get the best from their own GPU.
Python is an accepted high-level scripting language with a growing community in academia and industry. It is used in many scientific applications across different scientific fields and in more and more industries (for example, in engineering or life science). In all fields, the use of Python for high-performance and parallel computing is increasing. Several organizations and companies provide tools or support for Python development, including libraries for scientific computing, parallel computing, and MPI. Python is also used on many-core architectures and GPUs, for which specific Python interpreters are being developed. A related topic is the performance of the various interpreter and compiler implementations for Python.
The talk gives an overview of Python’s use in HPC and scientific computing and covers topics such as Python on massively parallel systems, GPU programming with Python, scientific libraries in Python, and Python interpreter performance issues. The talk will include examples of scientific codes and applications from many domains.
by Mark Shannon
CPython can be made faster by implementing the kinds of
optimizations used in the PyPy VM and in my HotPy VM.
All the necessary changes can be made without modifying the language or the API.
The CPython VM can be modified to support optimizations by adding
an effective garbage collector and by separating the
virtual-machine state from the real-machine state (like Stackless).
Optimizations can be implemented incrementally.
Since almost all of the optimizations are implemented in the interpreter,
all hardware platforms can benefit.
JIT compilers can then be added for common platforms (Intel, ARM, etc.).
For more information see http://hotpy.blogspot.com/
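As an illustration of what separating virtual-machine state from real-machine state can mean, here is a minimal sketch of a toy bytecode interpreter written in Python. It is not code from CPython, Stackless, or HotPy; the opcode names, the `Frame` class, and the `sum_to` guest function are all hypothetical and invented for this example. The point it shows is that guest-level call frames live in an ordinary heap-allocated list, so a guest call never recurses on the host stack.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """All state of one guest-level call, kept on the heap (hypothetical design)."""
    code: list                      # list of (opcode, operand) tuples
    pc: int = 0                     # index of the next instruction
    stack: list = field(default_factory=list)
    local: int = 0                  # the single argument of the guest function


def run(functions, entry, arg):
    """Run a guest function with one flat loop; guest calls push a Frame
    onto `frames` instead of recursing on the host (Python) stack."""
    frames = [Frame(functions[entry], local=arg)]
    result = None
    while frames:
        frame = frames[-1]
        op, operand = frame.code[frame.pc]
        frame.pc += 1
        if op == "PUSH":
            frame.stack.append(operand)
        elif op == "LOAD_ARG":
            frame.stack.append(frame.local)
        elif op == "ADD":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a + b)
        elif op == "SUB":
            b = frame.stack.pop()
            a = frame.stack.pop()
            frame.stack.append(a - b)
        elif op == "JUMP_IF_ZERO":
            if frame.stack.pop() == 0:
                frame.pc = operand
        elif op == "CALL":
            # Guest call: create a new heap frame, no host-level recursion.
            frames.append(Frame(functions[operand], local=frame.stack.pop()))
        elif op == "RETURN":
            value = frame.stack.pop()
            frames.pop()
            if frames:
                frames[-1].stack.append(value)   # hand result to the caller
            else:
                result = value                   # outermost frame finished
        else:
            raise ValueError(f"unknown opcode: {op}")
    return result


# Guest program: sum_to(n) = 0 if n == 0 else n + sum_to(n - 1)
FUNCTIONS = {
    "sum_to": [
        ("LOAD_ARG", None),
        ("JUMP_IF_ZERO", 9),
        ("LOAD_ARG", None),
        ("LOAD_ARG", None),
        ("PUSH", 1),
        ("SUB", None),
        ("CALL", "sum_to"),
        ("ADD", None),
        ("RETURN", None),
        ("PUSH", 0),
        ("RETURN", None),
    ],
}

# The guest recursion depth (50000) is far beyond Python's default
# recursion limit, yet the host stack depth stays constant.
print(run(FUNCTIONS, "sum_to", 50000))  # prints 1250025000
```

Because the whole guest call chain is plain data in `frames`, an interpreter built this way could in principle inspect, suspend, or rewrite it, which is the kind of freedom that optimizations and a later JIT would rely on.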