tinygrad device

Note

You likely want the upstream tinygrad, not tinygrab. Tinygrab is a snapshot of tinygrad with AI-generated docstrings. Upstream: https://tinygrad.org

class tinygrad.device.Allocator[source]

Bases: object

Base class for memory allocators. It provides methods to allocate, free, and copy memory.

alloc(size: int)[source]

The alloc method asserts that the given size is positive, then delegates to the private _alloc method to perform the actual allocation.

Parameters:

size (int) – The size of memory to allocate in bytes. It must be a positive integer.

Returns:

The result from the private _alloc method.

Raises:

AssertionError – If the given size is not positive.

copyin(dest, src: memoryview)[source]

The copyin method copies data from a memoryview object (src) into an allocated memory block (dest). The base implementation raises NotImplementedError; subclasses must override it.

Parameters:
  • dest – The destination allocated memory block where the data will be copied to.

  • src (memoryview) – The source memoryview object from where the data will be copied from.

Returns:

None

Raises:

NotImplementedError – Always raised by the base implementation; subclasses must override this method.

copyout(dest: memoryview, src)[source]

The copyout method copies data from an allocated memory block (src) into a memoryview object (dest). The base implementation raises NotImplementedError; subclasses must override it.

Parameters:
  • dest (memoryview) – The destination memoryview object where the data will be copied to.

  • src – The source allocated memory block from where the data will be copied from.

Returns:

None

Raises:

NotImplementedError – Always raised by the base implementation; subclasses must override this method.

free(opaque, size: int)[source]

The free method delegates to the private _free method to perform the actual freeing of memory. For backends whose alloc returns an ordinary Python object, freeing can be a no-op, since garbage collection handles cleanup.

Parameters:
  • opaque – Some data used for identifying the memory block to be freed. Its type or content depends on the implementation.

  • size (int) – The size of memory to be freed in bytes. It must be a positive integer.

Returns:

None
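The interface above can be sketched as a standalone class. This is an illustrative mock, not tinygrad's real Allocator: the class names and the bytearray backing are assumptions made for the example; only the method names and the positive-size assertion mirror the documentation.

```python
# Standalone sketch of the Allocator interface described above.
# SketchAllocator and ByteArrayAllocator are illustrative names, not tinygrad classes.
class SketchAllocator:
    def alloc(self, size: int):
        assert size > 0, "alloc size must be positive"
        return self._alloc(size)  # delegate to the subclass-provided _alloc

    def free(self, opaque, size: int):
        self._free(opaque)  # delegate to _free

    def _alloc(self, size: int):
        raise NotImplementedError("subclasses implement _alloc")

    def _free(self, opaque):
        pass  # a no-op is fine when the opaque object is ordinary Python data

    def copyin(self, dest, src: memoryview):
        raise NotImplementedError("subclasses implement copyin")

    def copyout(self, dest: memoryview, src):
        raise NotImplementedError("subclasses implement copyout")


class ByteArrayAllocator(SketchAllocator):
    # Backs every allocation with a plain bytearray.
    def _alloc(self, size: int):
        return bytearray(size)

    def copyin(self, dest, src: memoryview):
        dest[:] = src  # host-to-"device" copy

    def copyout(self, dest: memoryview, src):
        dest[:] = src  # "device"-to-host copy
```

A round trip through alloc/copyin/copyout then recovers exactly the bytes written in.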

class tinygrad.device.Buffer(device: str, size: int, dtype: DType, opaque: Any = None)[source]

Bases: object

This class represents a buffer that can be used to store data of a certain size and type. It also provides methods for copying data into and out of the buffer, as well as converting the buffer to a numpy array. The buffer is allocated using an allocator specific to the device it will be used with.

copyin(mv: memoryview)[source]

Copy data from a memory view into the buffer.

Parameters:

mv – The memory view to copy data from.

Returns:

The buffer.

static fromCPU(device: str, x: ndarray)[source]

Create a new buffer and copy data from a numpy array into it.

Parameters:
  • device – The name of the device that the buffer will be used with.

  • x – The numpy array to copy data from.

Returns:

The newly created buffer.

toCPU() ndarray[source]

Copies the buffer’s data from the device into a NumPy array on the CPU.

If the allocator provides an ‘as_buffer’ method, the device buffer is wrapped in a NumPy array without copying, using the ‘frombuffer’ function. Otherwise, an empty NumPy array matching self.size and self.dtype is created and the device data is copied into it.

Returns:

A Numpy array containing the data which was previously on GPU.
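The zero-copy-versus-copy decision in toCPU can be sketched with plain bytes standing in for NumPy arrays. HostAllocator and SketchBuffer are hypothetical stand-ins made up for this example, not tinygrad's real classes; only the as_buffer dispatch mirrors the documented behavior.

```python
# Illustrative sketch of the toCPU decision above (names are hypothetical).
class HostAllocator:
    def alloc(self, size): return bytearray(size)
    def copyin(self, dest, src): dest[:] = src
    def copyout(self, dest, src): dest[:] = src
    def as_buffer(self, src): return memoryview(src)  # enables the zero-copy path


class SketchBuffer:
    def __init__(self, allocator, size):
        self.allocator, self.size = allocator, size
        self._buf = allocator.alloc(size)

    def copyin(self, mv):
        self.allocator.copyin(self._buf, mv)
        return self  # mirrors Buffer.copyin returning the buffer

    def toCPU(self) -> bytes:
        if hasattr(self.allocator, "as_buffer"):
            return bytes(self.allocator.as_buffer(self._buf))  # zero-copy view
        out = bytearray(self.size)  # otherwise allocate host memory and copy out
        self.allocator.copyout(memoryview(out), self._buf)
        return bytes(out)
```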

class tinygrad.device.Compiled(allocator: Allocator, linearizer_opts: LinearizerOptions, renderer, compiler, runtime, graph=None)[source]

Bases: object

The Compiled class is responsible for compiling and executing the given AST (Abstract Syntax Tree). It takes in an allocator, linearizer options, renderer, compiler, runtime, and optionally a graph.

Parameters:
  • allocator (Allocator) – Memory allocator for the compiled code.

  • linearizer_opts (LinearizerOptions) – Options for the linearizer.

  • renderer – Renderer object to convert the AST into an executable format.

  • compiler – Compiler used to compile the rendered code.

  • runtime – Runtime environment for executing the compiled code.

  • graph – (Optional) Graph to be compiled. Default is None.

get_linearizer(ast: LazyOp) Linearizer[source]

Optimizes the given AST using a series of optimization techniques and returns the optimized linearized version.

Parameters:

ast (LazyOp) – The abstract syntax tree to be optimized.

Returns:

An instance of Linearizer with the optimized code.

get_runner(ast: LazyOp) CompiledASTRunner[source]

A cached version of the to_program function that takes an AST and returns a runner for it.

Parameters:

ast (LazyOp) – The abstract syntax tree to be executed.

Returns:

An instance of CompiledASTRunner with the compiled code.

synchronize()[source]

This function is a placeholder for device-specific synchronization code. It should be overridden in the derived class with specific implementation for the desired device.

to_program(k: Linearizer) CompiledASTRunner[source]

Converts a linearized AST into an executable format using the renderer and then compiles it using the compiler and runtime environment.

Parameters:

k (Linearizer) – The linearized AST to be converted.

Returns:

An instance of CompiledASTRunner with the compiled code.

class tinygrad.device.CompiledASTRunner(ast: LazyOp | None, name: str, prg: str, global_size: List[int] | None = None, local_size: List[int] | None = None, runtime_args: dict | None = None)[source]

Bases: JITRunner

A class for running compiled code generated from an Abstract Syntax Tree (AST). Inherits from JITRunner.

build(compiler, runtime)[source]

Builds the kernel from source code.

Parameters:
  • compiler (Function) – The compiler used to compile the kernel.

  • runtime (Function) – The runtime used to execute the kernel.

Returns:

Returns an instance of CompiledASTRunner.

Return type:

CompiledASTRunner

launch_dims(var_vals)[source]

Computes the launch dimensions for the kernel execution.

Parameters:

var_vals (Dict[Variable, int]) – Values of the variables used in the kernel.

Returns:

Returns the global and local sizes for the kernel execution.

Return type:

Tuple[List[int], List[int]]
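The substitution launch_dims performs can be hedged-sketched as follows, with plain strings standing in for tinygrad Variable objects; the function body is an assumption about the behavior, not the real implementation.

```python
# Illustrative sketch: symbolic entries in global_size/local_size are resolved
# against var_vals; strings stand in for tinygrad Variable objects here.
def launch_dims(global_size, local_size, var_vals):
    def resolve(dims):
        # replace each symbolic dimension with its concrete value
        return [var_vals[d] if isinstance(d, str) else d for d in dims]
    return resolve(global_size), resolve(local_size)
```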

class tinygrad.device.Interpreted(allocator: Allocator, fxn_for_op: Dict[UnaryOps | BinaryOps | ReduceOps | MovementOps | LoadOps | TernaryOps | BufferOps, Callable])[source]

Bases: object

The main class of the interpreter. This class is responsible for handling the allocation and execution of operations. It uses an allocator to manage memory and a dictionary of callable functions to perform various operations.

allocator

Object used to allocate and deallocate memory.

Type:

Allocator

fxn_for_op

A dictionary mapping operation types to their corresponding callable function.

Type:

Dict[Op, Callable]

synchronize

A placeholder function that does nothing. It is intended to be replaced with a proper synchronization mechanism in the future.

Type:

function

codegen

Placeholder for code generation functionality. Currently set to None.

Type:

None

graph

Placeholder for computation graph functionality. Currently set to None.

Type:

None

get_runner(ast: LazyOp) InterpretedASTRunner[source]

Builds and returns a runner for the given abstract syntax tree (AST), dispatching operations through the fxn_for_op dictionary.

Parameters:

ast (LazyOp) – The abstract syntax tree to be executed.

Returns:

A callable function that can execute the given AST.

Return type:

InterpretedASTRunner
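The fxn_for_op dispatch pattern can be illustrated with a toy table. The string op names here are stand-ins made up for the example; the real dictionary is keyed by tinygrad's Op enums.

```python
import operator

# Toy dispatch table in the spirit of fxn_for_op (op names are illustrative
# strings, not tinygrad's real Op enums).
fxn_for_op = {"ADD": operator.add, "MUL": operator.mul, "NEG": operator.neg}

def interpret(op, *args):
    return fxn_for_op[op](*args)  # look up the callable for this op and apply it
```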

class tinygrad.device.InterpretedASTRunner(ast: LazyOp, fxn: Callable)[source]

Bases: JITRunner

This class is used to run interpreted Abstract Syntax Trees (ASTs). It inherits from JITRunner.

fxn

A callable function

op_estimate

Estimated number of floating point operations required for the operation

mem_estimate

Estimated memory requirement for the operation

class tinygrad.device.JITRunner[source]

Bases: object

This class defines a Just-In-Time (JIT) runner that is responsible for executing operations and caching the results. The primary method of interest here is the ‘exec’ method, which takes in a list of Buffer objects and an optional dictionary of variable values. It returns an estimated time of execution.

exec(rawbufs: List[Buffer], var_vals: Dict[Variable, int] | None = None) float | None[source]

This method executes the operations associated with a given list of Buffer objects and an optional dictionary of variable values. If ‘var_vals’ is None, an empty dictionary is substituted. The method then imports CacheCollector and records the current JITRunner object, along with the buffers and variable values, in the cache before executing.

Parameters:
  • rawbufs – A list of Buffer objects to be executed.

  • var_vals – An optional dictionary containing variable values. Default is None.

Returns:

An estimated time of execution as a float value or None if not available.
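The bookkeeping described above can be sketched as follows. This CacheCollector is a stand-in invented for the example, and the timed kernel launch is reduced to a stub; only the default-dict substitution and the record-then-call ordering mirror the documentation.

```python
# Illustrative sketch of the exec() caching pattern (stand-in classes).
class CacheCollector:
    cache = []

    @classmethod
    def add(cls, prg, rawbufs, var_vals):
        cls.cache.append((prg, rawbufs, var_vals))  # record for JIT replay


class SketchJITRunner:
    def __call__(self, rawbufs, var_vals, jit=False):
        return None  # stub for the real, timed kernel launch

    def exec(self, rawbufs, var_vals=None):
        var_vals = var_vals if var_vals is not None else {}  # default to empty dict
        CacheCollector.add(self, rawbufs, var_vals)  # record before executing
        return self(rawbufs, var_vals)
```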

class tinygrad.device.LRUAllocator[source]

Bases: Allocator

This class defines an Allocator that uses the Least Recently Used (LRU) strategy to manage memory allocation and deallocation. It is a subclass of the Allocator parent class.

alloc(size: int)[source]

Allocates a block of memory of the specified ‘size’. If the cache holds a free opaque object for this size, it is reused; otherwise super().alloc(size) is called. If that raises MemoryError, free_cache() is called to release cached blocks and the allocation is retried with super().alloc(size).

free(opaque: Any, size: int)[source]

Deallocates the memory block identified by ‘opaque’ and ‘size’. If the environment variable “LRU” is set to 1 (the default), the opaque object is appended to the per-size cache for future reuse; otherwise the block is freed immediately via self._free(opaque).

free_cache()[source]

This method is used to free up some memory by calling the _free() method for each opaque object in the cache, then clearing the cache.
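The caching strategy described above can be sketched standalone. LRUCacheAllocator is a hypothetical name, and the bytearray backing and retry-on-MemoryError flow are assumptions for the example; the park-on-free / reuse-on-alloc behavior mirrors the documentation.

```python
from collections import defaultdict

# Standalone sketch of the LRU caching strategy (not tinygrad's real class):
# free() parks buffers in a per-size cache; alloc() reuses them before allocating.
class LRUCacheAllocator:
    def __init__(self):
        self.cache = defaultdict(list)  # size -> list of parked opaque buffers

    def _alloc(self, size):
        return bytearray(size)

    def _free(self, opaque):
        pass  # a no-op suffices for plain Python objects

    def alloc(self, size):
        if self.cache[size]:
            return self.cache[size].pop()  # reuse a cached buffer of this size
        try:
            return self._alloc(size)
        except MemoryError:
            self.free_cache()  # drop cached buffers, then retry once
            return self._alloc(size)

    def free(self, opaque, size):
        self.cache[size].append(opaque)  # park for reuse instead of freeing

    def free_cache(self):
        for buffers in self.cache.values():
            for opaque in buffers:
                self._free(opaque)
        self.cache.clear()
```

Note that freeing then reallocating the same size returns the identical object, which is the whole point of the cache.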

tinygrad.device.update_stats(name: str, op_estimate: Node | int, mem_estimate: Node | int, var_vals: Dict[Variable, int] | None, et: float | None, buf_count, jit=False, num_kernels=1, lra: Dict | None = None)[source]

This function updates the global counters for operations and memory usage, as well as prints debugging information if the DEBUG level is 2 or higher.

Parameters:
  • name – The name of the operation being executed.

  • op_estimate – An estimated number of operations to be executed.

  • mem_estimate – An estimated amount of memory to be used.

  • var_vals – An optional dictionary containing variable values. Default is None.

  • et – An optional estimated time of execution in seconds. Default is None.

  • buf_count – The number of buffers (i.e., the argument count) associated with the operation.

  • jit – An optional boolean flag indicating whether to use Just-In-Time (JIT) compilation. Default is False.

  • num_kernels – The number of kernels used in the operation. Default is 1.

  • lra – An optional dictionary containing local and global size information for the operation. Default is None.
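The counter bookkeeping can be sketched minimally. This GlobalCounters is a stand-in for tinygrad's real global counters, and the DEBUG printing and variable substitution are omitted; treat the whole block as an assumption-laden illustration of what "updates the global counters" means.

```python
# Minimal sketch of the counter updates performed by update_stats
# (GlobalCounters here is a stand-in; DEBUG printing is omitted).
class GlobalCounters:
    global_ops = 0
    global_mem = 0
    kernel_count = 0

def update_stats(name, op_estimate, mem_estimate, num_kernels=1):
    GlobalCounters.kernel_count += num_kernels  # one entry per launched kernel
    GlobalCounters.global_ops += op_estimate    # accumulate FLOP estimate
    GlobalCounters.global_mem += mem_estimate   # accumulate memory estimate
```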