API

This page contains a comprehensive list of functionality within blaze. Docstrings should provide sufficient understanding for any individual function or class.

Interactive Use

_Data(data_source, dshape[, name]) Bind a data resource to a symbol, for use in expressions and computation.

Expressions

Projection(*args, **kwargs) Select a subset of fields from data.
Selection(*args, **kwargs) Filter elements of expression based on predicate
Label(*args, **kwargs) An expression with a name.
ReLabel(*args, **kwargs) Table with same content but with new labels
Map(*args, **kwargs) Map an arbitrary Python function across elements in a collection
Apply(*args, **kwargs) Apply an arbitrary Python function onto an expression
Coerce(*args, **kwargs) Coerce an expression to a different type.
Coalesce(*args, **kwargs) SQL like coalesce.
Cast(*args, **kwargs) Cast an expression to a different type.
Sort(*args, **kwargs) Table in sorted order
Distinct(*args, **kwargs) Remove duplicate elements from an expression
Head(*args, **kwargs) First n elements of collection
Merge(*args, **kwargs) Merge many fields together
Join(*args, **kwargs) Join two tables on common columns
Concat(*args, **kwargs) Stack tables on common columns
IsIn(*args, **kwargs) Check if an expression contains values from a set.
By(*args, **kwargs) Split-Apply-Combine Operator

Blaze Server

Server([data, formats, authorization, ...]) Blaze Data Server
Client(url[, serial, verify_ssl, auth]) Client for Blaze Server

Additional Server Utilities

expr_md5(expr) Returns the md5 hash of the str of the expression.
to_tree(expr[, names]) Represent Blaze expression with core data structures
from_tree(expr[, namespace]) Convert core data structures to Blaze expression
data_spider(path[, ignore, followlinks, ...]) Traverse a directory and call blaze.data on its contents.
from_yaml(fh[, ignore, followlinks, hidden, ...]) Construct a dictionary of resources from a YAML specification.

Definitions

blaze.interactive.data(data_source, dshape=None, name=None, fields=None, schema=None, **kwargs)

Bind a data resource to a symbol, for use in expressions and computation.

A data object presents a consistent view onto a variety of concrete data sources. Like symbol objects, they are meant to be used in expressions. Because they are tied to concrete data resources, data objects can be used with compute directly, making them convenient for interactive exploration.

Parameters:
  • data_source (object) – Any type with discover and compute implementations
  • fields (list, optional) – Field or column names, will be inferred from data_source if possible
  • dshape (str or DataShape, optional) – DataShape describing input data
  • name (str, optional) – A name for the data.

Examples

>>> t = data([(1, 'Alice', 100),
...           (2, 'Bob', -200),
...           (3, 'Charlie', 300),
...           (4, 'Denis', 400),
...           (5, 'Edith', -500)],
...          fields=['id', 'name', 'balance'])
>>> t[t.balance < 0].name
    name
0    Bob
1  Edith

blaze.server.spider.data_spider(path, ignore=(ValueError, NotImplementedError), followlinks=True, hidden=False, extra_kwargs=None)

Traverse a directory and call blaze.data on its contents.

Parameters:
  • path (str) – Path to a directory of resources to load
  • ignore (tuple of Exception, optional) – Ignore these exceptions when calling blaze.data
  • followlinks (bool, optional) – Follow symbolic links
  • hidden (bool, optional) – Load hidden files
  • extra_kwargs (dict, optional) – extra kwargs to forward on to blaze.data
Returns:

A possibly nested dictionary mapping basenames to resources.

Return type:

dict
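
The traversal described above can be pictured with a minimal standard-library sketch. It is not Blaze's implementation: spider_sketch and its loader argument are hypothetical stand-ins for data_spider and blaze.data, and treating names that start with a dot as "hidden" is an assumption.

```python
import os
import tempfile


def spider_sketch(path, loader, ignore=(ValueError, NotImplementedError),
                  followlinks=True, hidden=False):
    """Hypothetical stand-in for data_spider: map basenames to resources."""
    result = {}
    for entry in sorted(os.listdir(path)):
        if not hidden and entry.startswith('.'):
            continue  # assumption: "hidden" means a leading dot
        full = os.path.join(path, entry)
        if os.path.islink(full) and not followlinks:
            continue  # skip symbolic links when followlinks is False
        if os.path.isdir(full):
            # Directories become nested dictionaries.
            result[entry] = spider_sketch(full, loader, ignore,
                                          followlinks, hidden)
        else:
            try:
                result[entry] = loader(full)
            except ignore:
                pass  # skip files the loader cannot handle


    return result


def csv_only(filename):
    """Toy loader (stands in for blaze.data): accepts only .csv files."""
    if not filename.endswith('.csv'):
        raise ValueError('unsupported file: %s' % filename)
    return 'resource:' + os.path.basename(filename)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'sub'))
    for name in ('a.csv', 'notes.txt', '.hidden.csv',
                 os.path.join('sub', 'b.csv')):
        with open(os.path.join(root, name), 'w') as f:
            f.write('x\n1\n')
    tree = spider_sketch(root, csv_only)

print(tree)
```

In this sketch the unsupported notes.txt is skipped because the toy loader raises ValueError, which is listed in ignore.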

blaze.server.spider.from_yaml(fh, ignore=(ValueError, NotImplementedError), followlinks=True, hidden=False, relative_to_yaml_dir=False)

Construct a dictionary of resources from a YAML specification.

Parameters:
  • fh (file) – File object referring to the YAML specification of resources to load.
  • ignore (tuple of Exception, optional) – Ignore these exceptions when calling blaze.data.
  • followlinks (bool, optional) – Follow symbolic links.
  • hidden (bool, optional) – Load hidden files.
  • relative_to_yaml_dir (bool, optional, default False) – Load paths relative to the YAML file’s directory. The default is to load relative to the process’s current working directory.
Returns:

A dictionary mapping top level keys in a YAML file to resources.

Return type:

dict
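
The effect of relative_to_yaml_dir can be illustrated with a small path-resolution sketch. The helper resolve_source and the example paths are hypothetical; this only shows the two base directories described above, not how Blaze resolves paths internally.

```python
import os


def resolve_source(source, yaml_path, relative_to_yaml_dir=False):
    """Hypothetical helper: pick the base directory for a relative path."""
    if os.path.isabs(source):
        return source
    if relative_to_yaml_dir:
        base = os.path.dirname(os.path.abspath(yaml_path))
    else:
        base = os.getcwd()  # default: the process' current working directory
    return os.path.normpath(os.path.join(base, source))


yaml_path = os.path.join(os.sep, 'srv', 'config', 'resources.yaml')
from_cwd = resolve_source('data/accounts.csv', yaml_path)
from_yaml_dir = resolve_source('data/accounts.csv', yaml_path,
                               relative_to_yaml_dir=True)
print(from_cwd)
print(from_yaml_dir)
```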

See also

data_spider()
Traverse a directory tree for resources
class blaze.server.server.Server(data=None, formats=None, authorization=None, allow_profiler=False, profiler_output=None, profile_by_default=False, allow_add=False)

Blaze Data Server

Host local data through a web API

Parameters:
  • data (dict, optional) – A dictionary mapping dataset name to any data format that blaze understands.
  • formats (iterable, optional) – An iterable of supported serialization formats. By default, the server will support JSON. A serialization format is an object that supports: name, loads, and dumps.
  • authorization (callable, optional) – A callable to be used to check the auth header from the client. This callable should accept a single argument that will either be None indicating that no header was passed, or an object containing a username and password attribute. By default, all requests are allowed.
  • allow_profiler (bool, optional) – Allow payloads to specify “profile”: true which will run the computation under cProfile.
  • profiler_output (str, optional) –

    The directory to write pstats files after profile runs. The files will be written in a structure like:

    {profiler_output}/{hash(expr)}/{timestamp}

    This defaults to a relative path of profiler_output. This requires allow_profiler=True.

    If this is the string ‘:response’ then writing to the local filesystem is disabled. Only requests that specify profiler_output=’:response’ will be served. All others will return a 403 (Forbidden).

  • profile_by_default (bool, optional) – Run the profiler on any computation that does not explicitly set “profile”: false. This requires allow_profiler=True.
  • allow_add (bool, optional) – Expose an /add endpoint to allow datasets to be dynamically added to the server. Since this increases the risk of security holes, it defaults to False.
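
An authorization callable of the kind described above might look like the following sketch. The Credentials tuple and the USERS table are hypothetical, and returning True to allow a request is an assumption about the expected return convention.

```python
import hmac
from collections import namedtuple

# Stand-in for the object the server passes in; per the description above it
# only needs ``username`` and ``password`` attributes.
Credentials = namedtuple('Credentials', ['username', 'password'])

# Hypothetical user table, for illustration only.
USERS = {'alice': 's3cret'}


def check_auth(auth):
    """Return True to allow the request (return convention is an assumption)."""
    if auth is None:  # no auth header was sent
        return False
    expected = USERS.get(auth.username)
    if expected is None:
        return False
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected.encode(), auth.password.encode())


print(check_auth(None))
print(check_auth(Credentials('alice', 's3cret')))
```

Under the same assumption, the callable would be passed as Server(data, authorization=check_auth).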

Examples

>>> from pandas import DataFrame
>>> df = DataFrame([[1, 'Alice',   100],
...                 [2, 'Bob',    -200],
...                 [3, 'Alice',   300],
...                 [4, 'Dennis',  400],
...                 [5,  'Bob',   -500]],
...                columns=['id', 'name', 'amount'])
>>> server = Server({'accounts': df})
>>> server.run() 

run(port=6363, retry=False, **kwargs)

Run the server.

Parameters:
  • port (int, optional) – The port to bind to.
  • retry (bool, optional) – If the port is busy, should we retry with the next available port?
  • **kwargs – Forwarded to the underlying flask app’s run method.

Notes

This function blocks forever when successful.
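
The retry behaviour can be sketched as a simple port search. This is an illustration under stated assumptions, not the server's code: pick_port and port_is_busy are hypothetical helpers, and probing ports one at a time is only one plausible reading of "the next available port".

```python
import socket


def port_is_busy(port, host='127.0.0.1'):
    """Return True if binding to (host, port) fails."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return True
    return False


def pick_port(port=6363, retry=False, is_busy=port_is_busy, max_port=65535):
    """Hypothetical helper mirroring the ``retry`` description above."""
    while port <= max_port:
        if not is_busy(port):
            return port
        if not retry:
            raise OSError('port %d is busy' % port)
        port += 1  # retry with the next port
    raise OSError('no free port found')


# Deterministic demonstration with a fake "busy" predicate.
busy = {6363, 6364}
chosen = pick_port(6363, retry=True, is_busy=busy.__contains__)
print(chosen)
```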

blaze.server.server.to_tree(expr, names=None)

Represent Blaze expression with core data structures

Transform a Blaze expression into a form using only strings, dicts, lists and base types (int, float, datetime, ...). This form can be useful for serialization.

Parameters:
  • expr (Expr) – A Blaze expression

Examples

>>> t = symbol('t', 'var * {x: int32, y: int32}')
>>> to_tree(t) 
{'op': 'Symbol',
 'args': ['t', 'var * { x : int32, y : int32 }', False]}
>>> to_tree(t.x.sum()) 
{'op': 'sum',
 'args': [{'op': 'Field',
           'args': [{'op': 'Symbol',
                     'args': ['t',
                              'var * { x : int32, y : int32 }',
                              False]},
                    'x']}]}

Simplify the expression by passing an explicit names dictionary. In the example below we replace the Symbol node with the string 't'.

>>> tree = to_tree(t.x, names={t: 't'})
>>> tree
{'op': 'Field', 'args': ['t', 'x']}
>>> from_tree(tree, namespace={'t': t})
t.x
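
Because the tree uses only core data structures, it can be passed through a standard serializer. The sketch below round-trips the Symbol tree from the first example through the json module; the tree literal is copied from above rather than produced by calling to_tree. Values such as datetime would need extra handling with plain json.

```python
import json

# Tree literal copied from the to_tree(t) example above.
tree = {'op': 'Symbol',
        'args': ['t', 'var * { x : int32, y : int32 }', False]}

payload = json.dumps(tree, sort_keys=True)  # text suitable for sending or storing
restored = json.loads(payload)

print(restored == tree)
```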

See also

from_tree()

blaze.server.server.from_tree(expr, namespace=None)

Convert core data structures to Blaze expression

Core data structure representations created by to_tree are converted back into Blaze expressions.

Parameters:
  • expr (dict) – Core data structure representation of a Blaze expression, as produced by to_tree

Examples

>>> t = symbol('t', 'var * {x: int32, y: int32}')
>>> tree = to_tree(t)
>>> tree 
{'op': 'Symbol',
 'args': ['t', 'var * { x : int32, y : int32 }', False]}
>>> from_tree(tree)
<`t` symbol; dshape='var * {x: int32, y: int32}'>
>>> tree = to_tree(t.x.sum())
>>> tree 
{'op': 'sum',
 'args': [{'op': 'Field',
           'args': [{'op': 'Symbol',
                     'args': ['t',
                              'var * {x : int32, y : int32}',
                              False]},
                    'x']}]}
>>> from_tree(tree)
sum(t.x)

Simplify the expression by passing an explicit names dictionary. In the example below we replace the Symbol node with the string 't'.

>>> tree = to_tree(t.x, names={t: 't'})
>>> tree 
{'op': 'Field', 'args': ['t', 'x']}
>>> from_tree(tree, namespace={'t': t})
t.x

See also

to_tree()

blaze.server.server.expr_md5(expr)

Returns the md5 hash of the str of the expression.

Parameters:
  • expr (Expr) – The expression to hash.
Returns:

hexdigest – The hexdigest of the md5 of the str of expr.

Return type:

str
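
The description above amounts to hashing the string form of an object. A minimal sketch with hashlib follows; str_md5 is a hypothetical stand-in for expr_md5, the UTF-8 encoding is an assumption, and a plain string is used in place of a real expression.

```python
import hashlib


def str_md5(obj):
    """md5 hexdigest of str(obj); a hypothetical stand-in for expr_md5."""
    return hashlib.md5(str(obj).encode('utf-8')).hexdigest()


digest = str_md5('sum(t.x)')
print(len(digest))
```

Since the digest depends only on the string form, two objects with the same str hash identically.
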
class blaze.server.client.Client(url, serial=<SerializationFormat: 'json'>, verify_ssl=True, auth=None, **kwargs)

Client for Blaze Server

Provides programmatic access to datasets living on Blaze Server

Parameters:
  • url (str) – URL of a Blaze server
  • serial (SerializationFormat, optional) – The serialization format object to use. Defaults to JSON. A serialization format is an object that supports: name, loads, and dumps.
  • verify_ssl (bool, optional) – Verify the ssl certificate from the server. This is enabled by default.
  • auth (tuple, optional) – The username and password to use when connecting to the server. If not provided, no auth header will be sent.

Examples

>>> # This example matches with the docstring of ``Server``
>>> from blaze import data
>>> c = Client('localhost:6363')
>>> t = data(c) 

add(name, resource_uri, *args, **kwargs)

Add the given resource URI to the Blaze server.

Parameters:
  • name (str) – The name to give the resource
  • resource_uri (str) – The URI string describing the resource to add to the server, e.g. ‘sqlite:///path/to/file.db::table’
  • imports (list) – A list of string names for any modules that must be imported on the Blaze server before the resource can be added. This is identical to the imports field in a Blaze server YAML file.
  • args (any, optional) – Any additional positional arguments that can be passed to the blaze.resource constructor for this resource type
  • kwargs (any, optional) – Any additional keyword arguments that can be passed to the blaze.resource constructor for this resource type
dshape

The datashape of the client

class blaze.expr.collections.Concat(*args, **kwargs)

Stack tables on common columns

Parameters:
  • lhs, rhs (Expr) – Collections to concatenate
  • axis (int, optional) – The axis to concatenate on.

Examples

>>> from blaze import symbol

Vertically stack tables:

>>> names = symbol('names', '5 * {name: string, id: int32}')
>>> more_names = symbol('more_names', '7 * {name: string, id: int32}')
>>> stacked = concat(names, more_names)
>>> stacked.dshape
dshape("12 * {name: string, id: int32}")

Vertically stack matrices:

>>> mat_a = symbol('a', '3 * 5 * int32')
>>> mat_b = symbol('b', '3 * 5 * int32')
>>> vstacked = concat(mat_a, mat_b, axis=0)
>>> vstacked.dshape
dshape("6 * 5 * int32")

Horizontally stack matrices:

>>> hstacked = concat(mat_a, mat_b, axis=1)
>>> hstacked.dshape
dshape("3 * 10 * int32")
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

# Cast to correct mislabeled optionals
>>> s.cast('int64').dshape
dshape("int64")

# Cast to give concrete dimension length
>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

This differs from x == y, which builds an Eq(x, y) expression instead of returning a bool.

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
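
Why a separate identity test is useful can be seen with a toy expression class whose == operator builds a node. Sym and identical below are hypothetical and are not Blaze classes; they only mimic the behaviour shown in the examples above.

```python
class Sym:
    """Toy expression node; not a Blaze class."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def __eq__(self, other):
        # Like expression libraries, ``==`` builds a node instead of a bool.
        return Sym('Eq', self, other)

    def __add__(self, other):
        return Sym('Add', self, other)

    # Defining __eq__ removes the default hash, so restore it.
    __hash__ = object.__hash__


def identical(a, b):
    """Structural comparison that never calls the overloaded ``==``."""
    if isinstance(a, Sym) and isinstance(b, Sym):
        return (a.op == b.op and len(a.args) == len(b.args)
                and all(identical(x, y) for x, y in zip(a.args, b.args)))
    if isinstance(a, Sym) or isinstance(b, Sym):
        return False
    if isinstance(a, tuple) and isinstance(b, tuple):
        return len(a) == len(b) and all(identical(x, y) for x, y in zip(a, b))
    return type(a) is type(b) and a == b


x = Sym('x')
print(type(x == 1).__name__)       # an expression node, not a bool
print(identical(x + 1, x + 1))
```
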
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

blaze.expr.collections.concat(lhs, rhs, axis=0)

Stack tables on common columns

Parameters:
  • lhs, rhs (Expr) – Collections to concatenate
  • axis (int, optional) – The axis to concatenate on.

Examples

>>> from blaze import symbol

Vertically stack tables:

>>> names = symbol('names', '5 * {name: string, id: int32}')
>>> more_names = symbol('more_names', '7 * {name: string, id: int32}')
>>> stacked = concat(names, more_names)
>>> stacked.dshape
dshape("12 * {name: string, id: int32}")

Vertically stack matrices:

>>> mat_a = symbol('a', '3 * 5 * int32')
>>> mat_b = symbol('b', '3 * 5 * int32')
>>> vstacked = concat(mat_a, mat_b, axis=0)
>>> vstacked.dshape
dshape("6 * 5 * int32")

Horizontally stack matrices:

>>> hstacked = concat(mat_a, mat_b, axis=1)
>>> hstacked.dshape
dshape("3 * 10 * int32")
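
The shape arithmetic in the examples above can be sketched in plain Python. concat_shape is a hypothetical helper; requiring all other dimensions to match is an assumption about what a sensible concatenation needs, not a statement about Blaze's checks.

```python
def concat_shape(lhs_shape, rhs_shape, axis=0):
    """Shape of two arrays stacked along ``axis`` (hypothetical helper)."""
    if len(lhs_shape) != len(rhs_shape):
        raise ValueError('inputs must have the same number of dimensions')
    for i, (a, b) in enumerate(zip(lhs_shape, rhs_shape)):
        if i != axis and a != b:
            raise ValueError('dimension %d differs: %d != %d' % (i, a, b))
    out = list(lhs_shape)
    out[axis] = lhs_shape[axis] + rhs_shape[axis]  # only this axis grows
    return tuple(out)


print(concat_shape((5,), (7,)))
print(concat_shape((3, 5), (3, 5), axis=0))
print(concat_shape((3, 5), (3, 5), axis=1))
```
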
class blaze.expr.collections.Distinct(*args, **kwargs)

Remove duplicate elements from an expression

Parameters:
  • on (tuple of Field) – The subset of fields or names of fields to be distinct on.

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: int, id: int}')
>>> e = distinct(t)
>>> data = [('Alice', 100, 1),
...         ('Bob', 200, 2),
...         ('Alice', 100, 1)]
>>> from blaze.compute.python import compute
>>> sorted(compute(e, data))
[('Alice', 100, 1), ('Bob', 200, 2)]

Use a subset by passing on:

>>> import pandas as pd
>>> e = distinct(t, 'name')
>>> data = pd.DataFrame([['Alice', 100, 1],
...                      ['Alice', 200, 2],
...                      ['Bob', 100, 1],
...                      ['Bob', 200, 2]],
...                     columns=['name', 'amount', 'id'])
>>> compute(e, data)
    name  amount  id
0  Alice     100   1
1    Bob     100   1

blaze.expr.collections.distinct(expr, *on)

Remove duplicate elements from an expression

Parameters:
  • on (tuple of Field) – The subset of fields or names of fields to be distinct on.

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: int, id: int}')
>>> e = distinct(t)
>>> data = [('Alice', 100, 1),
...         ('Bob', 200, 2),
...         ('Alice', 100, 1)]
>>> from blaze.compute.python import compute
>>> sorted(compute(e, data))
[('Alice', 100, 1), ('Bob', 200, 2)]

Use a subset by passing on:

>>> import pandas as pd
>>> e = distinct(t, 'name')
>>> data = pd.DataFrame([['Alice', 100, 1],
...                      ['Alice', 200, 2],
...                      ['Bob', 100, 1],
...                      ['Bob', 200, 2]],
...                     columns=['name', 'amount', 'id'])
>>> compute(e, data)
    name  amount  id
0  Alice     100   1
1    Bob     100   1
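
The semantics of distinct with a subset of fields can be sketched without any backend. distinct_on is a hypothetical helper; keeping the first row seen for each key matches the pandas output shown above, but other backends may choose differently.

```python
def distinct_on(rows, fields, on=()):
    """Keep the first row for each distinct key (pure-Python sketch)."""
    idx = [fields.index(name) for name in (on or fields)]
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[i] for i in idx)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


fields = ['name', 'amount', 'id']
rows = [('Alice', 100, 1), ('Alice', 200, 2), ('Bob', 100, 1), ('Bob', 200, 2)]
by_name = distinct_on(rows, fields, on=('name',))
print(by_name)
```
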
class blaze.expr.collections.Head(*args, **kwargs)

First n elements of collection

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.head(5).dshape
dshape("5 * {name: string, amount: int32}")

blaze.expr.collections.head(child, n=10)

First n elements of collection

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.head(5).dshape
dshape("5 * {name: string, amount: int32}")
class blaze.expr.collections.IsIn(*args, **kwargs)

Check if an expression contains values from a set.

Return a boolean expression indicating whether another expression contains values that are members of a collection.

Parameters:
  • expr (Expr) – Expression whose elements to check for membership in keys
  • keys (Sequence) – Elements to test against. Blaze stores this as a frozenset.

Examples

Check if a vector contains any of 1, 2 or 3:

>>> from blaze import symbol
>>> t = symbol('t', '10 * int64')
>>> expr = t.isin([1, 2, 3])
>>> expr.dshape
dshape("10 * bool")

blaze.expr.collections.isin(expr, keys)

Check if an expression contains values from a set.

Return a boolean expression indicating whether another expression contains values that are members of a collection.

Parameters:
  • expr (Expr) – Expression whose elements to check for membership in keys
  • keys (Sequence) – Elements to test against. Blaze stores this as a frozenset.

Examples

Check if a vector contains any of 1, 2 or 3:

>>> from blaze import symbol
>>> t = symbol('t', '10 * int64')
>>> expr = t.isin([1, 2, 3])
>>> expr.dshape
dshape("10 * bool")
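
The elementwise meaning of isin can be sketched in plain Python. isin_sketch is a hypothetical helper operating on a list rather than a Blaze expression; it freezes the keys once, in line with the frozenset note above.

```python
def isin_sketch(values, keys):
    """Elementwise membership test against a frozen set of keys."""
    key_set = frozenset(keys)
    return [v in key_set for v in values]


mask = isin_sketch([1, 5, 3, 7, 2], [1, 2, 3])
print(mask)
```
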
class blaze.expr.collections.Join(*args, **kwargs)

Join two tables on common columns

Parameters:
  • lhs, rhs (Expr) – Expressions to join
  • on_left (str, optional) – The fields from the left side to join on. If no on_right is passed, then these are the fields for both sides.
  • on_right (str, optional) – The fields from the right side to join on.
  • how ({'inner', 'outer', 'left', 'right'}) – What type of join to perform.
  • suffixes (pair of str) – The suffixes to be applied to the left and right sides in order to resolve duplicate field names.

Examples

>>> from blaze import symbol
>>> names = symbol('names', 'var * {name: string, id: int}')
>>> amounts = symbol('amounts', 'var * {amount: int, id: int}')

Join tables based on shared column name

>>> joined = join(names, amounts, 'id')

Join based on different column names

>>> amounts = symbol('amounts', 'var * {amount: int, acctNumber: int}')
>>> joined = join(names, amounts, 'id', 'acctNumber')

blaze.expr.collections.join(lhs, rhs, on_left=None, on_right=None, how='inner', suffixes=('_left', '_right'))

Join two tables on common columns

Parameters:
  • lhs, rhs (Expr) – Expressions to join
  • on_left (str, optional) – The fields from the left side to join on. If no on_right is passed, then these are the fields for both sides.
  • on_right (str, optional) – The fields from the right side to join on.
  • how ({'inner', 'outer', 'left', 'right'}) – What type of join to perform.
  • suffixes (pair of str) – The suffixes to be applied to the left and right sides in order to resolve duplicate field names.
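
What on_left and on_right mean for an inner join can be sketched on lists of dicts. inner_join is a hypothetical helper; keeping the left-hand key name in the result is an assumption made for this illustration.

```python
def inner_join(left, right, on_left, on_right=None):
    """Inner join of two lists of dicts (pure-Python sketch)."""
    on_right = on_right or on_left  # one name serves both sides by default
    index = {}
    for rrow in right:
        index.setdefault(rrow[on_right], []).append(rrow)
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[on_left], []):
            row = dict(lrow)
            row.update({k: v for k, v in rrow.items() if k != on_right})
            joined.append(row)
    return joined


names = [{'name': 'Alice', 'id': 1}, {'name': 'Bob', 'id': 2}]
amounts = [{'amount': 100, 'acctNumber': 1}, {'amount': 200, 'acctNumber': 3}]
result = inner_join(names, amounts, 'id', 'acctNumber')
print(result)
```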

Examples

>>> from blaze import symbol
>>> names = symbol('names', 'var * {name: string, id: int}')
>>> amounts = symbol('amounts', 'var * {amount: int, id: int}')

Join tables based on shared column name

>>> joined = join(names, amounts, 'id')

Join based on different column names

>>> amounts = symbol('amounts', 'var * {amount: int, acctNumber: int}')
>>> joined = join(names, amounts, 'id', 'acctNumber')
class blaze.expr.collections.Merge(*args, **kwargs)

Merge many fields together

Examples

>>> from blaze import symbol, label
>>> accounts = symbol('accounts', 'var * {name: string, x: int, y: real}')
>>> merge(accounts.name, z=accounts.x + accounts.y).fields
['name', 'z']

To control the ordering of the fields, use label:

>>> merge(label(accounts.name, 'NAME'), label(accounts.x, 'X')).dshape
dshape("var * {NAME: string, X: int32}")
>>> merge(label(accounts.x, 'X'), label(accounts.name, 'NAME')).dshape
dshape("var * {X: int32, NAME: string}")

blaze.expr.collections.merge(*exprs, **kwargs)

Merge many fields together

Examples

>>> from blaze import symbol, label
>>> accounts = symbol('accounts', 'var * {name: string, x: int, y: real}')
>>> merge(accounts.name, z=accounts.x + accounts.y).fields
['name', 'z']

To control the ordering of the fields, use label:

>>> merge(label(accounts.name, 'NAME'), label(accounts.x, 'X')).dshape
dshape("var * {NAME: string, X: int32}")
>>> merge(label(accounts.x, 'X'), label(accounts.name, 'NAME')).dshape
dshape("var * {X: int32, NAME: string}")
class blaze.expr.collections.Sample(*args, **kwargs)

Random row-wise sample. Can specify n or frac for an absolute or fractional number of rows, respectively.

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.sample(n=2).dshape
dshape("var * {name: string, amount: int32}")
>>> accounts.sample(frac=0.1).dshape
dshape("var * {name: string, amount: int32}")

blaze.expr.collections.sample(child, n=None, frac=None)

Random row-wise sample. Can specify n or frac for an absolute or fractional number of rows, respectively.

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.sample(n=2).dshape
dshape("var * {name: string, amount: int32}")
>>> accounts.sample(frac=0.1).dshape
dshape("var * {name: string, amount: int32}")
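
The difference between n and frac can be sketched with the standard random module. sample_rows is a hypothetical helper; requiring exactly one of the two arguments, rounding the fractional count, and the seed parameter are assumptions made for the illustration.

```python
import random


def sample_rows(rows, n=None, frac=None, seed=None):
    """Random row-wise sample given either ``n`` or ``frac`` (sketch)."""
    if (n is None) == (frac is None):
        raise ValueError('specify exactly one of n or frac')
    if frac is not None:
        n = int(round(frac * len(rows)))  # fractional size -> row count
    n = min(n, len(rows))
    return random.Random(seed).sample(rows, n)


rows = list(range(20))
two = sample_rows(rows, n=2, seed=0)
tenth = sample_rows(rows, frac=0.1, seed=0)
print(len(two), len(tenth))
```
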
class blaze.expr.collections.Shift(*args, **kwargs)

Shift a column backward or forward by N elements

Parameters:
  • expr (Expr) – The expression to shift. This expression’s dshape should be columnar
  • n (int) – The number of elements to shift by. If n < 0 then shift backward, if n == 0 do nothing, else shift forward.

blaze.expr.collections.shift(expr, n)

Shift a column backward or forward by N elements

Parameters:
  • expr (Expr) – The expression to shift. This expression’s dshape should be columnar
  • n (int) – The number of elements to shift by. If n < 0 then shift backward, if n == 0 do nothing, else shift forward.
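
The three cases for n can be sketched on a plain list. shift_sketch is a hypothetical helper; padding vacated positions with None and reading "forward" as moving values toward the end are assumptions, since the text above does not specify either.

```python
def shift_sketch(values, n, fill=None):
    """Shift a list by n positions, padding with ``fill`` (an assumption)."""
    if n == 0:
        return list(values)
    size = len(values)
    k = min(abs(n), size)
    pad = [fill] * k
    if n > 0:                       # forward: values move toward the end
        return pad + list(values[:size - k])
    return list(values[k:]) + pad   # backward: values move toward the start


print(shift_sketch([1, 2, 3, 4], 1))
print(shift_sketch([1, 2, 3, 4], -1))
```
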
class blaze.expr.collections.Sort(*args, **kwargs)

Table in sorted order

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.sort('amount', ascending=False).schema
dshape("{name: string, amount: int32}")

Some backends support sorting by arbitrary rowwise tables, e.g.

>>> accounts.sort(-accounts.amount) 
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

# Cast to correct mislabeled optionals >>> s.cast(‘int64’).dshape dshape(“int64”)

# Cast to give concrete dimension length >>> t = symbol(‘t’, ‘var * float32’) >>> t.cast(‘10 * float32’).dshape dshape(“10 * float32”)

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

blaze.expr.collections.sort(child, key=None, ascending=True)

Sort a collection

Parameters:
  • key (str, list of str, or Expr) –

    Defines by what you want to sort.

    • A single column string: t.sort('amount')
    • A list of column strings: t.sort(['name', 'amount'])
    • An expression: t.sort(-t.amount)
  • ascending (bool, optional) – Determines order of the sort
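
The three forms of key can be mirrored with Python's built-in sorted on a list of records. This is only an analogy under the assumption that each row is a dict; it does not use Blaze and says nothing about how any backend performs the sort.

```python
accounts = [
    {'name': 'Alice', 'amount': 100},
    {'name': 'Bob', 'amount': -50},
    {'name': 'Alice', 'amount': 25},
]

# A single column, like t.sort('amount')
by_amount = sorted(accounts, key=lambda row: row['amount'])

# A list of columns, like t.sort(['name', 'amount'])
by_name_then_amount = sorted(
    accounts, key=lambda row: (row['name'], row['amount']))

# An expression, like t.sort(-t.amount)
by_negated_amount = sorted(accounts, key=lambda row: -row['amount'])

# ascending=False corresponds to reverse=True
descending = sorted(accounts, key=lambda row: row['amount'], reverse=True)
```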
class blaze.expr.collections.Tail(*args, **kwargs)

Last n elements of collection

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.tail(5).dshape
dshape("5 * {name: string, amount: int32}")
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

blaze.expr.collections.tail(child, n=10)

Last n elements of collection

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.tail(5).dshape
dshape("5 * {name: string, amount: int32}")
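
For intuition, taking the last n elements of an arbitrary iterable can be sketched in plain Python. The helper tail_rows is hypothetical and is not how Blaze backends compute tail; it only shows the intended result.

```python
from collections import deque


def tail_rows(rows, n=10):
    """Return the last n items of any iterable (hypothetical helper)."""
    # A bounded deque keeps only the most recent n items it has seen.
    return list(deque(rows, maxlen=n))


last_five = tail_rows(range(100), 5)
short_input = tail_rows(range(3), 5)
```

When the input holds fewer than n items, this sketch returns all of them.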
blaze.expr.collections.transform(t, replace=True, **kwargs)

Add named columns to table

>>> from blaze import symbol
>>> t = symbol('t', 'var * {x: int, y: int}')
>>> transform(t, z=t.x + t.y).fields
['x', 'y', 'z']
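
The effect on concrete rows can be sketched in plain Python, assuming each row is a dict. The helper add_columns is hypothetical and does not model the replace argument; it only shows new named columns being appended after the existing ones.

```python
def add_columns(rows, **new_columns):
    """Return copies of rows with extra named columns (hypothetical helper).

    Each keyword maps a new column name to a function of the original row.
    """
    result = []
    for row in rows:
        extended = dict(row)
        for name, func in new_columns.items():
            extended[name] = func(row)
        result.append(extended)
    return result


rows = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
with_z = add_columns(rows, z=lambda r: r['x'] + r['y'])
```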
class blaze.expr.expressions.Apply(*args, **kwargs)

Apply an arbitrary Python function onto an expression

Examples

>>> t = symbol('t', 'var * {name: string, amount: int}')
>>> h = t.apply(hash, dshape='int64')  # Hash value of resultant dataset

You must provide the datashape of the result with the dshape= keyword. For datashape examples see http://datashape.pydata.org/grammar.html#some-simple-examples

If you are using a chunking backend and your operation can be safely split and concatenated, add the splittable=True keyword argument:

>>> t.apply(f, dshape='...', splittable=True) 
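
One way to read "safely split and concatenated" is that applying the function to each chunk and concatenating the results gives the same answer as applying it to the whole dataset. The sketch below checks that property in plain Python; it is an interpretation rather than Blaze code, and all function names in it are hypothetical.

```python
from itertools import chain


def apply_in_chunks(func, chunks):
    """Apply func to each chunk and concatenate the results."""
    return list(chain.from_iterable(func(chunk) for chunk in chunks))


def double_all(values):
    # Elementwise: each output depends only on its own input.
    return [2 * v for v in values]


def running_total(values):
    # Stateful: each output depends on all earlier inputs.
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out


data = [1, 2, 3, 4]
chunks = [data[:2], data[2:]]

double_is_splittable = (
    apply_in_chunks(double_all, chunks) == double_all(data))
running_is_splittable = (
    apply_in_chunks(running_total, chunks) == running_total(data))
```

Under this reading, a function like double_all would be a candidate for splittable=True, while running_total would not.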
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Cast(*args, **kwargs)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Coalesce(*args, **kwargs)

SQL like coalesce.

coalesce(a, b) = {
    a if a is not NULL
    b otherwise
}

Examples

>>> coalesce(1, 2)
1
>>> coalesce(1, None)
1
>>> coalesce(None, 2)
2
>>> coalesce(None, None) is None
True
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Coerce(*args, **kwargs)

Coerce an expression to a different type.

Examples

>>> t = symbol('t', '100 * float64')
>>> t.coerce(to='int64')
t.coerce(to='int64')
>>> t.coerce('float32')
t.coerce(to='float32')
>>> t.coerce('int8').dshape
dshape("100 * int8")
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.ElemWise(*args, **kwargs)

Elementwise operation.

The shape of this expression matches the shape of the child.

cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Expr(*args, **kwargs)

Symbolic expression of a computation

All Blaze expressions (Join, By, Sort, ...) descend from this class. It contains shared logic and syntax. It in turn inherits from Node, which holds all tree traversal logic.

cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Field(*args, **kwargs)

A single field from an expression.

Get a single field from an expression with record-type schema. We store the name of the field in the _name attribute.

Examples

>>> points = symbol('points', '5 * 3 * {x: int32, y: int32}')
>>> points.x.dshape
dshape("5 * 3 * int32")

For fields that aren’t valid Python identifiers, use [] syntax:

>>> points = symbol('points', '5 * 3 * {"space station": float64}')
>>> points['space station'].dshape
dshape("5 * 3 * float64")
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Label(*args, **kwargs)

An expression with a name.

Examples

>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> expr = accounts.amount * 100
>>> expr._name
'amount'
>>> expr.label('new_amount')._name
'new_amount'
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Map(*args, **kwargs)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply
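
What the mapped function does to concrete values can be shown in plain Python. This sketch does not use Blaze, and it substitutes datetime.fromtimestamp with an explicit UTC timezone for datetime.utcfromtimestamp, which is deprecated in recent Python versions. As a result it produces timezone-aware datetimes, unlike the example above.

```python
from datetime import datetime, timezone

# Times stored as integer seconds since the Unix epoch.
times = [0, 86400]

# Apply the conversion function to every element, one at a time.
datetimes = [datetime.fromtimestamp(ts, tz=timezone.utc) for ts in times]
```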

cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Projection(*args, **kwargs)

Select a subset of fields from data.

Examples

>>> accounts = symbol('accounts',
...                   'var * {name: string, amount: int, id: int}')
>>> accounts[['name', 'amount']].schema
dshape("{name: string, amount: int32}")
>>> accounts[['name', 'amount']]
accounts[['name', 'amount']]
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.ReLabel(*args, **kwargs)

Table with same content but with new labels

Examples

>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.schema
dshape("{name: string, amount: int32}")
>>> accounts.relabel(amount='balance').schema
dshape("{name: string, balance: int32}")
>>> accounts.relabel(not_a_column='definitely_not_a_column')
Traceback (most recent call last):
    ...
ValueError: Cannot relabel non-existent child fields: {'not_a_column'}
>>> s = symbol('s', 'var * {"0": int64}')
>>> s.relabel({'0': 'foo'})
s.relabel({'0': 'foo'})
>>> s.relabel(0='foo') 
Traceback (most recent call last):
    ...
SyntaxError: keyword can't be an expression

Notes

When names are not valid Python names, such as integers or strings with spaces, you must pass a dictionary to relabel. For example:

>>> s = symbol('s', 'var * {"0": int64}')
>>> s.relabel({'0': 'foo'})
s.relabel({'0': 'foo'})
>>> t = symbol('t', 'var * {"whoo hoo": ?float32}')
>>> t.relabel({"whoo hoo": 'foo'})
t.relabel({'whoo hoo': 'foo'})
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Selection(*args, **kwargs)

Filter elements of expression based on predicate

Examples

>>> accounts = symbol('accounts',
...                   'var * {name: string, amount: int, id: int}')
>>> deadbeats = accounts[accounts.amount < 0]
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.SimpleSelection(*args, **kwargs)

Internal selection class that does not treat the predicate as an input.

cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Slice(*args, **kwargs)

Elements from start until stop. On many backends, a step parameter is also allowed.

Examples

>>> from blaze import symbol
>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts[2:7].dshape
dshape("5 * {name: string, amount: int32}")
>>> accounts[2:7:2].dshape
dshape("3 * {name: string, amount: int32}")
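
The leading dimensions in the two dshapes above (5 and 3) agree with ordinary Python slice arithmetic. The following sketch is an assumption about where those numbers come from, not a description of Blaze internals; slice_length is a hypothetical helper.

```python
def slice_length(start, stop, step=1):
    """Number of indices selected by start:stop:step (hypothetical helper)."""
    return len(range(start, stop, step))


without_step = slice_length(2, 7)     # indices 2, 3, 4, 5, 6
with_step = slice_length(2, 7, 2)     # indices 2, 4, 6
```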
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

class blaze.expr.expressions.Symbol(name, dshape, token=0)

Symbolic data. The leaf of a Blaze expression

Examples

>>> points = symbol('points', '5 * 3 * {x: int, y: int}')
>>> points
<`points` symbol; dshape='5 * 3 * {x: int32, y: int32}'>
>>> points.dshape
dshape("5 * 3 * {x: int32, y: int32}")
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()

blaze.expr.expressions.apply(expr, func, dshape, splittable=False)

Apply an arbitrary Python function onto an expression

Examples

>>> t = symbol('t', 'var * {name: string, amount: int}')
>>> h = t.apply(hash, dshape='int64')  # Hash value of resultant dataset

You must provide the datashape of the result with the dshape= keyword. For datashape examples see http://datashape.pydata.org/grammar.html#some-simple-examples

If you are using a chunking backend and your operation can be safely split and concatenated, add the splittable=True keyword argument:

>>> t.apply(f, dshape='...', splittable=True) 
blaze.expr.expressions.cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

blaze.expr.expressions.coalesce(a, b)

SQL like coalesce.

coalesce(a, b) = {
    a if a is not NULL
    b otherwise
}

Examples

>>> coalesce(1, 2)
1
>>> coalesce(1, None)
1
>>> coalesce(None, 2)
2
>>> coalesce(None, None) is None
True
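
The scalar behaviour shown in these examples can be reproduced with a one-line Python function. This is a sketch of the semantics only, with None standing in for NULL; coalesce_value is a hypothetical name and not the Blaze function.

```python
def coalesce_value(a, b):
    """Return a unless it is None, in which case return b."""
    return a if a is not None else b


first = coalesce_value(1, 2)
fallback = coalesce_value(None, 2)
zero_kept = coalesce_value(0, 5)
```

Note that this differs from the common `a or b` idiom, which would discard falsy but non-null values such as 0 or an empty string.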
blaze.expr.expressions.coerce(expr, to)

Coerce an expression to a different type.

Examples

>>> t = symbol('t', '100 * float64')
>>> t.coerce(to='int64')
t.coerce(to='int64')
>>> t.coerce('float32')
t.coerce(to='float32')
>>> t.coerce('int8').dshape
dshape("100 * int8")
blaze.expr.expressions.label(expr, lab)

An expression with a name.

Examples

>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> expr = accounts.amount * 100
>>> expr._name
'amount'
>>> expr.label('new_amount')._name
'new_amount'
blaze.expr.expressions.ndim(expr)

Number of dimensions of expression

>>> symbol('s', '3 * var * int32').ndim
2
blaze.expr.expressions.projection(expr, names)

Select a subset of fields from data.

Examples

>>> accounts = symbol('accounts',
...                   'var * {name: string, amount: int, id: int}')
>>> accounts[['name', 'amount']].schema
dshape("{name: string, amount: int32}")
>>> accounts[['name', 'amount']]
accounts[['name', 'amount']]
blaze.expr.expressions.relabel(child, labels=None, **kwargs)

Table with same content but with new labels

Examples

>>> accounts = symbol('accounts', 'var * {name: string, amount: int}')
>>> accounts.schema
dshape("{name: string, amount: int32}")
>>> accounts.relabel(amount='balance').schema
dshape("{name: string, balance: int32}")
>>> accounts.relabel(not_a_column='definitely_not_a_column')
Traceback (most recent call last):
    ...
ValueError: Cannot relabel non-existent child fields: {'not_a_column'}
>>> s = symbol('s', 'var * {"0": int64}')
>>> s.relabel({'0': 'foo'})
s.relabel({'0': 'foo'})
>>> s.relabel(0='foo') 
Traceback (most recent call last):
    ...
SyntaxError: keyword can't be an expression

Notes

When names are not valid Python names, such as integers or strings with spaces, you must pass a dictionary to relabel. For example:

>>> s = symbol('s', 'var * {"0": int64}')
>>> s.relabel({'0': 'foo'})
s.relabel({'0': 'foo'})
>>> t = symbol('t', 'var * {"whoo hoo": ?float32}')
>>> t.relabel({"whoo hoo": 'foo'})
t.relabel({'whoo hoo': 'foo'})
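
The renaming rule, including the error for unknown fields, can be sketched on a single dict-shaped row. The helper relabel_row is hypothetical and its error message is only modelled on the one shown above; it is not Blaze code.

```python
def relabel_row(row, labels):
    """Rename keys of row according to labels (hypothetical helper).

    Raises ValueError if labels mentions a field that row does not have.
    """
    missing = set(labels) - set(row)
    if missing:
        raise ValueError(
            'Cannot relabel non-existent fields: %r' % sorted(missing))
    # Keep the original field order; untouched names pass through unchanged.
    return {labels.get(name, name): value for name, value in row.items()}


row = {'name': 'Alice', 'amount': 100}
renamed = relabel_row(row, {'amount': 'balance'})

# Names that are not valid identifiers work because they are dict keys,
# not keyword arguments.
odd_names = relabel_row({'0': 1, 'whoo hoo': 2.5}, {'0': 'foo'})
```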
blaze.expr.expressions.selection(table, predicate)

Filter elements of expression based on predicate

Examples

>>> accounts = symbol('accounts',
...                   'var * {name: string, amount: int, id: int}')
>>> deadbeats = accounts[accounts.amount < 0]
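
On concrete rows, a selection behaves like filtering with a boolean mask computed from the predicate. The sketch below assumes dict-shaped rows and made-up sample data; it illustrates the idea in plain Python and is not how any particular backend evaluates the expression.

```python
# Sample data invented for illustration.
accounts = [
    {'name': 'Alice', 'amount': 100, 'id': 1},
    {'name': 'Bob', 'amount': -200, 'id': 2},
    {'name': 'Charlie', 'amount': -5, 'id': 3},
]

# The predicate accounts.amount < 0, evaluated row by row.
is_negative = [row['amount'] < 0 for row in accounts]

# Keep only the rows where the predicate holds.
deadbeats = [row for row, keep in zip(accounts, is_negative) if keep]
```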
blaze.expr.expressions.symbol(*args, **kwargs)

Symbolic data. The leaf of a Blaze expression

Examples

>>> points = symbol('points', '5 * 3 * {x: int, y: int}')
>>> points
<`points` symbol; dshape='5 * 3 * {x: int32, y: int32}'>
>>> points.dshape
dshape("5 * 3 * {x: int32, y: int32}")
class blaze.expr.reductions.Reduction(_child, axis=None, keepdims=False)

A column-wise reduction

Blaze supports the same class of reductions as NumPy and Pandas.

sum, min, max, any, all, mean, var, std, count, nunique

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: int, id: int}')
>>> e = t['amount'].sum()
>>> data = [['Alice', 100, 1],
...         ['Bob', 200, 2],
...         ['Alice', 50, 3]]
>>> from blaze.compute.python import compute
>>> compute(e, data)
350
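
For comparison, several of the listed reductions can be computed over the same amount column using only the Python standard library. This is a sketch of what the reductions mean on concrete data, not of how Blaze computes them; the dictionary keys simply reuse the reduction names listed above.

```python
from statistics import mean

data = [['Alice', 100, 1],
        ['Bob', 200, 2],
        ['Alice', 50, 3]]
fields = ['name', 'amount', 'id']

# Pull out the 'amount' column by position.
amount = [row[fields.index('amount')] for row in data]

reductions = {
    'sum': sum(amount),
    'min': min(amount),
    'max': max(amount),
    'mean': mean(amount),
    'count': len(amount),
    'nunique': len(set(amount)),
}
```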
cast(expr, to)

Cast an expression to a different type.

This is only an expression time operation.

Examples

>>> s = symbol('s', '?int64')
>>> s.cast('?int32').dshape
dshape("?int32")

Cast to correct mislabeled optionals

>>> s.cast('int64').dshape
dshape("int64")

Cast to give concrete dimension length

>>> t = symbol('t', 'var * float32')
>>> t.cast('10 * float32').dshape
dshape("10 * float32")

isidentical(a, b)

Strict equality testing

Different from x == y -> Eq(x, y)

>>> isidentical(1, 1)
True
>>> from blaze.expr import symbol
>>> x = symbol('x', 'int')
>>> isidentical(x, 1)
False
>>> isidentical(x + 1, x + 1)
True
>>> isidentical(x + 1, x + 2)
False
>>> isidentical((x, x + 1), (x, x + 1))
True
>>> isidentical((x, x + 1), (x, x + 2))
False
map(func, schema=None, name=None)

Map an arbitrary Python function across elements in a collection

Examples

>>> from datetime import datetime
>>> t = symbol('t', 'var * {price: real, time: int64}')  # times as integers
>>> datetimes = t.time.map(datetime.utcfromtimestamp)

Optionally provide extra schema information

>>> datetimes = t.time.map(datetime.utcfromtimestamp,
...                           schema='{time: datetime}')

See also

blaze.expr.expressions.Apply()
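
As a rough plain-Python analogue of mapping a function over a column, the sketch below converts integer timestamps one element at a time. The sample timestamps are hypothetical, and it uses the timezone-aware datetime.fromtimestamp because recent Python versions deprecate utcfromtimestamp.

```python
from datetime import datetime, timezone

# Hypothetical epoch timestamps standing in for the integer `time` column.
times = [0, 86400, 1_000_000_000]

# Apply the function to each element independently, as map does for a column.
datetimes = [datetime.fromtimestamp(ts, tz=timezone.utc) for ts in times]

print(datetimes[1].isoformat())
```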

class blaze.expr.reductions.Summary(_child, names, values, axis=None, keepdims=False)

A collection of named reductions

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: int, id: int}')
>>> expr = summary(number=t.id.nunique(), sum=t.amount.sum())
>>> data = [['Alice', 100, 1],
...         ['Bob', 200, 2],
...         ['Alice', 50, 1]]
>>> from blaze import compute
>>> compute(expr, data)
(2, 350)
class blaze.expr.reductions.count(_child, axis=None, keepdims=False)

The number of non-null elements

class blaze.expr.reductions.nelements(_child, axis=None, keepdims=False)

Compute the number of elements in a collection, including missing values.

See also

blaze.expr.reductions.count
compute the number of non-null elements

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: float64}')
>>> t[t.amount < 1].nelements()
nelements(t[t.amount < 1])
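
The difference between nelements and count can be sketched in plain Python, assuming missing values are represented as None. The sample column is hypothetical.

```python
# Hypothetical column with missing values represented as None.
amounts = [0.5, None, 0.25, None, 3.0]

# nelements: every element, including missing values.
nelements = len(amounts)

# count: non-null elements only.
count = sum(1 for a in amounts if a is not None)

print(nelements, count)
```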
class blaze.expr.reductions.std(child, unbiased=False, *args, **kwargs)

Standard Deviation

Parameters:
  • child (Expr) – An expression
  • unbiased (bool, optional) –

    Compute the square root of an unbiased estimate of the population variance if this is True.

    Warning

    This does not return an unbiased estimate of the population standard deviation.

See also

var

blaze.expr.reductions.summary(keepdims=False, axis=None, **kwargs)

A collection of named reductions

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: int, id: int}')
>>> expr = summary(number=t.id.nunique(), sum=t.amount.sum())
>>> data = [['Alice', 100, 1],
...         ['Bob', 200, 2],
...         ['Alice', 50, 1]]
>>> from blaze import compute
>>> compute(expr, data)
(2, 350)
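
For comparison, the same two named reductions can be computed by hand in plain Python. This sketch mirrors the example data and does not use Blaze.

```python
# Rows follow the schema {name: string, amount: int, id: int} from the example.
data = [["Alice", 100, 1],
        ["Bob", 200, 2],
        ["Alice", 50, 1]]

# Each named reduction is computed over the same rows.
result = {
    "number": len({row[2] for row in data}),  # nunique of id
    "sum": sum(row[1] for row in data),       # sum of amount
}

print(result)
```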
class blaze.expr.reductions.var(child, unbiased=False, *args, **kwargs)

Variance

Parameters:
  • child (Expr) – An expression
  • unbiased (bool, optional) – Compute an unbiased estimate of the population variance if this is True. In NumPy and pandas, this parameter is called ddof (delta degrees of freedom) and is equal to 1 for unbiased and 0 for biased.
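
The relationship between unbiased and ddof can be sketched with a small helper checked against the standard library's statistics module. The helper name and sample values are illustrative, not part of Blaze.

```python
import math
import statistics


def variance(values, unbiased=False):
    """Variance with ddof = 1 if unbiased else 0 (illustrative helper)."""
    ddof = 1 if unbiased else 0
    mean = sum(values) / len(values)
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return squared_deviations / (len(values) - ddof)


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

biased = variance(values)                    # divides by n
unbiased = variance(values, unbiased=True)   # divides by n - 1

# The standard library exposes the same two estimators.
assert math.isclose(biased, statistics.pvariance(values))
assert math.isclose(unbiased, statistics.variance(values))

print(biased, unbiased)
```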
blaze.expr.reductions.vnorm(expr, ord=None, axis=None, keepdims=False)

Vector norm

See np.linalg.norm
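For readers without NumPy at hand, the vector conventions of np.linalg.norm can be sketched in plain Python for a flat sequence. This helper is illustrative only; it does not model the axis or keepdims arguments.

```python
import math


def vector_norm(values, ord=None):
    """p-norm of a flat sequence, following NumPy's vector conventions."""
    if ord is None:
        ord = 2  # the default is the Euclidean norm
    if ord == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** ord for v in values) ** (1.0 / ord)


v = [3.0, -4.0]
print(vector_norm(v))                # Euclidean norm
print(vector_norm(v, ord=1))         # sum of absolute values
print(vector_norm(v, ord=math.inf))  # largest absolute value
```
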

class blaze.expr.arrays.Transpose(*args, **kwargs)

Transpose dimensions in an N-Dimensional array

Examples

>>> x = symbol('x', '10 * 20 * int32')
>>> x.T
transpose(x)
>>> x.T.shape
(20, 10)

Specify axis ordering with axes keyword argument

>>> x = symbol('x', '10 * 20 * 30 * int32')
>>> x.transpose([2, 0, 1])
transpose(x, axes=[2, 0, 1])
>>> x.transpose([2, 0, 1]).shape
(30, 10, 20)
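
The shapes in these examples follow a simple rule: output dimension i takes the length of input dimension axes[i], and omitting axes reverses the dimensions. A minimal sketch, using a hypothetical helper that works on plain shape tuples:

```python
def transposed_shape(shape, axes=None):
    """Shape after transposing; axes=None reverses the dimensions."""
    if axes is None:
        axes = range(len(shape) - 1, -1, -1)
    return tuple(shape[axis] for axis in axes)


print(transposed_shape((10, 20)))
print(transposed_shape((10, 20, 30), axes=[2, 0, 1]))
```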
class blaze.expr.arrays.TensorDot(*args, **kwargs)

Dot Product: Contract and sum dimensions of two arrays

>>> x = symbol('x', '20 * 20 * int32')
>>> y = symbol('y', '20 * 30 * int32')
>>> x.dot(y)
tensordot(x, y)
>>> tensordot(x, y, axes=[0, 0])
tensordot(x, y, axes=[0, 0])
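
What "contract and sum" means for shapes can be sketched as follows: the contracted axis of each operand disappears, and the remaining dimensions are concatenated. The helper below is hypothetical, and its default (last axis of the left operand with the first axis of the right) is the usual matrix-product convention, assumed here rather than taken from the Blaze source.

```python
def tensordot_shape(lhs_shape, rhs_shape, axes=None):
    """Result shape of contracting one axis of each operand.

    axes is [lhs_axis, rhs_axis]. The default is an assumption of this
    sketch: last axis of lhs with first axis of rhs.
    """
    if axes is None:
        axes = [len(lhs_shape) - 1, 0]
    lhs_axis, rhs_axis = axes
    if lhs_shape[lhs_axis] != rhs_shape[rhs_axis]:
        raise ValueError("contracted dimensions must have equal length")
    # Drop the contracted axis from each side and keep the rest in order.
    lhs_rest = [n for i, n in enumerate(lhs_shape) if i != lhs_axis]
    rhs_rest = [n for i, n in enumerate(rhs_shape) if i != rhs_axis]
    return tuple(lhs_rest + rhs_rest)


print(tensordot_shape((20, 20), (20, 30)))
print(tensordot_shape((20, 5), (20, 30), axes=[0, 0]))
```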
blaze.expr.arrays.dot(lhs, rhs)

Dot Product: Contract and sum dimensions of two arrays

>>> x = symbol('x', '20 * 20 * int32')
>>> y = symbol('y', '20 * 30 * int32')
>>> x.dot(y)
tensordot(x, y)
>>> tensordot(x, y, axes=[0, 0])
tensordot(x, y, axes=[0, 0])
blaze.expr.arrays.transpose(expr, axes=None)

Transpose dimensions in an N-Dimensional array

Examples

>>> x = symbol('x', '10 * 20 * int32')
>>> x.T
transpose(x)
>>> x.T.shape
(20, 10)

Specify axis ordering with axes keyword argument

>>> x = symbol('x', '10 * 20 * 30 * int32')
>>> x.transpose([2, 0, 1])
transpose(x, axes=[2, 0, 1])
>>> x.transpose([2, 0, 1]).shape
(30, 10, 20)
blaze.expr.arrays.tensordot(lhs, rhs, axes=None)

Dot Product: Contract and sum dimensions of two arrays

>>> x = symbol('x', '20 * 20 * int32')
>>> y = symbol('y', '20 * 30 * int32')
>>> x.dot(y)
tensordot(x, y)
>>> tensordot(x, y, axes=[0, 0])
tensordot(x, y, axes=[0, 0])
class blaze.expr.arithmetic.Arithmetic(lhs, rhs)

Superclass for arithmetic operators such as add or mul

class blaze.expr.math.notnull(child)

Return whether an expression is not null

Examples

>>> from blaze import symbol, compute
>>> s = symbol('s', 'var * int64')
>>> expr = notnull(s)
>>> expr.dshape
dshape("var * bool")
>>> list(compute(expr, [1, 2, None, 3]))
[True, True, False, True]
class blaze.expr.math.UnaryMath(child)

Mathematical unary operator with a real-valued dshape, such as sin or exp

class blaze.expr.broadcast.Broadcast(*args, **kwargs)

Fuse scalar expressions over collections

Given elementwise operations on collections, e.g.

>>> from blaze import sin
>>> a = symbol('a', '100 * int')
>>> t = symbol('t', '100 * {x: int, y: int}')
>>> expr = sin(a) + t.y**2

It may be best to represent this as a scalar expression mapped over a collection

>>> sa = symbol('a', 'int')
>>> st = symbol('t', '{x: int, y: int}')
>>> sexpr = sin(sa) + st.y**2
>>> expr = Broadcast((a, t), (sa, st), sexpr)

This provides opportunities for optimized computation.

In practice, expressions are often collected into Broadcast expressions automatically. This class is mainly intended for internal use.
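
The optimization opportunity can be illustrated in plain Python: evaluating each elementwise operation separately builds intermediate collections, whereas applying one scalar function per element does the work in a single pass. This is a sketch with made-up inputs, not a description of how any Blaze backend executes a Broadcast.

```python
import math

# Hypothetical inputs: a collection of ints and a collection of (x, y) records.
a = [0, 1, 2]
t = [(10, 1), (20, 2), (30, 3)]

# Unfused: each elementwise operation materializes an intermediate collection.
sin_a = [math.sin(v) for v in a]
y_squared = [y ** 2 for (_x, y) in t]
unfused = [s + q for s, q in zip(sin_a, y_squared)]


def scalar_expr(a_elem, t_elem):
    """The scalar expression sin(a) + t.y**2 for one element of each input."""
    _x, y = t_elem
    return math.sin(a_elem) + y ** 2


# Fused: one pass over the inputs, no intermediate collections.
fused = [scalar_expr(a_elem, t_elem) for a_elem, t_elem in zip(a, t)]

print(fused == unfused)
```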

blaze.expr.broadcast.scalar_symbols(exprs)

Gives a sequence of scalar symbols to mirror these expressions

Examples

>>> x = symbol('x', '5 * 3 * int32')
>>> y = symbol('y', '5 * 3 * int32')
>>> xx, yy = scalar_symbols([x, y])
>>> xx._name, xx.dshape
('x', dshape("int32"))
>>> yy._name, yy.dshape
('y', dshape("int32"))
blaze.expr.broadcast.broadcast_collect(expr, broadcastable=(<class 'blaze.expr.expressions.Map'>, <class 'blaze.expr.expressions.Field'>, <class 'blaze.expr.datetime.DateTime'>, <class 'blaze.expr.arithmetic.UnaryOp'>, <class 'blaze.expr.arithmetic.BinOp'>, <class 'blaze.expr.expressions.Coerce'>, <class 'blaze.expr.collections.Shift'>, <class 'blaze.expr.strings.Like'>, <class 'blaze.expr.strings.StrCat'>), want_to_broadcast=(<class 'blaze.expr.expressions.Map'>, <class 'blaze.expr.datetime.DateTime'>, <class 'blaze.expr.arithmetic.UnaryOp'>, <class 'blaze.expr.arithmetic.BinOp'>, <class 'blaze.expr.expressions.Coerce'>, <class 'blaze.expr.collections.Shift'>, <class 'blaze.expr.strings.Like'>, <class 'blaze.expr.strings.StrCat'>), no_recurse=None)

Collapse an expression down using Broadcast (tabular cases only)

Expressions whose type is listed in broadcastable are folded into Broadcast operations

>>> t = symbol('t', 'var * {x: int, y: int, z: int, when: datetime}')
>>> expr = (t.x + 2*t.y).distinct()
>>> broadcast_collect(expr)
distinct(Broadcast(_children=(t,), _scalars=(t,), _scalar_expr=t.x + (2 * t.y)))
>>> from blaze import exp
>>> expr = t.x + 2 * exp(-(t.x - 1.3) ** 2)
>>> broadcast_collect(expr)
Broadcast(_children=(t,), _scalars=(t,), _scalar_expr=t.x + (2 * (exp(-((t.x - 1.3) ** 2)))))
class blaze.expr.datetime.DateTime(*args, **kwargs)

Superclass for datetime accessors

class blaze.expr.split_apply_combine.By(*args, **kwargs)

Split-Apply-Combine Operator

Examples

>>> from blaze import symbol
>>> t = symbol('t', 'var * {name: string, amount: int, id: int}')
>>> e = by(t['name'], total=t['amount'].sum())
>>> data = [['Alice', 100, 1],
...         ['Bob', 200, 2],
...         ['Alice', 50, 3]]
>>> from blaze.compute.python import compute
>>> sorted(compute(e, data))
[('Alice', 150), ('Bob', 200)]
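
The three phases named by the operator can be spelled out in plain Python over the same rows. This is an illustrative sketch of the split-apply-combine pattern, not of Blaze's Python backend.

```python
from collections import defaultdict

data = [["Alice", 100, 1],
        ["Bob", 200, 2],
        ["Alice", 50, 3]]

# Split: bucket the amounts by name.
groups = defaultdict(list)
for name, amount, _id in data:
    groups[name].append(amount)

# Apply and combine: reduce each bucket, then collect one row per key.
totals = sorted((name, sum(amounts)) for name, amounts in groups.items())

print(totals)
```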
blaze.expr.split_apply_combine.count_values(expr, sort=True)

Count occurrences of elements in this column

Results are sorted by count by default. Pass sort=False to avoid this behavior.
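
A plain-Python analogue uses collections.Counter. The sample column is hypothetical, and the most-frequent-first ordering shown for the sorted case is an assumption about the sort direction.

```python
from collections import Counter

# Hypothetical column of values.
names = ["Alice", "Bob", "Alice", "Charlie", "Alice", "Bob"]

counts = Counter(names)

# Analogous to sort=True: most frequent values first.
print(counts.most_common())

# Analogous to sort=False: no ordering by count is applied.
print(list(counts.items()))
```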