Numba for CUDA GPUs

  • Overview
    • Terminology
    • Programming model
    • Requirements
      • Supported GPUs
      • Software
    • Missing CUDA Features
  • Writing CUDA Kernels
    • Introduction
    • Kernel declaration
    • Kernel invocation
      • Choosing the block size
      • Multi-dimensional blocks and grids
    • Thread positioning
      • Absolute positions
      • Further Reading
  • Memory management
    • Data transfer
      • Device arrays
    • Pinned memory
    • Streams
    • Shared memory and thread synchronization
    • Local memory
    • SmartArrays (experimental)
    • Deallocation Behavior
  • Writing Device Functions
  • Supported Python features in CUDA Python
    • Language
      • Execution Model
      • Constructs
    • Built-in types
    • Built-in functions
    • Standard library modules
      • cmath
      • math
      • operator
  • Supported Atomic Operations
    • Example
  • Random Number Generation
    • Example
  • Device management
    • Device Selection
  • The Device List
  • Examples
    • Matrix multiplication
  • Debugging CUDA Python with the CUDA Simulator
    • Using the simulator
    • Supported features
  • GPU Reduction
    • @reduce
    • class Reduce
  • CUDA Ufuncs and Generalized Ufuncs
    • Example: Basic Example
    • Example: Calling Device Functions
    • Generalized CUDA ufuncs
  • Sharing CUDA Memory
    • Sharing between processes
      • Export device array to another process
      • Import IPC memory from another process
  • CUDA Frequently Asked Questions
    • nvprof reports “No kernels were profiled”
2017 Continuum Analytics, Inc.
All Rights Reserved.