Bag

Dask.Bag parallelizes computations across a large collection of generic Python objects. It is particularly useful when dealing with large quantities of semi-structured data like JSON blobs or log files.
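As a rough illustration of the idea above, the sketch below builds a small bag from JSON strings, parses them, and then filters and aggregates the records. The sample records and variable names are made up for this example, and the `scheduler="synchronous"` argument assumes a reasonably recent Dask release; it is used here only to keep the run single-process and easy to reproduce.

```python
import json

import dask.bag as db

# Hypothetical sample data: a few JSON blobs, one per "line",
# standing in for a much larger log or JSON-lines file.
records = [
    {"name": "Alice", "balance": 100},
    {"name": "Bob", "balance": 20},
    {"name": "Carol", "balance": 75},
    {"name": "Dan", "balance": 5},
]
lines = [json.dumps(r) for r in records]

# Build a bag split into two partitions; each partition can be
# processed independently.
bag = db.from_sequence(lines, npartitions=2)

# Operations are lazy: nothing is computed until .compute() is called.
parsed = bag.map(json.loads)
rich_names = parsed.filter(lambda r: r["balance"] > 50).pluck("name")
total_balance = parsed.pluck("balance").sum()

# Assumption: the synchronous scheduler is selected for a simple,
# deterministic run; on real workloads a parallel scheduler would be used.
names = rich_names.compute(scheduler="synchronous")
total = total_balance.compute(scheduler="synchronous")

print(names)
print(total)
```

The same chain of `map`, `filter`, and `pluck` calls would apply to a bag created from files on disk; only the construction step changes.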

  • Overview
  • Create Dask Bags
  • Store Dask Bags
  • API
2017 Anaconda, Inc.
All Rights Reserved.