tinygrad renderer.cstyle

Note

You likely want the upstream tinygrad, not tinygrab. Tinygrab contains AI generated docstrings for a tinygrad snapshot. Upstream: https://tinygrad.org

class tinygrad.renderer.cstyle.CStyleLanguage(size_prefix: str = 'int', generic_var_prefix: str = '', kernel_prefix: str = '', buffer_prefix: str = '', buffer_suffix: str = '', smem_align: str = '', smem_prefix: str = '', smem_prefix_for_cast: bool = True, arg_int_prefix: str = '', barrier: str = '', xid: ~typing.List[str] = [], gid: ~typing.List[str] = [], lid: ~typing.List[str] = [], global_max: ~typing.List[int] = [], local_max: ~typing.List[int] = [], extra_args: ~typing.List[str] = [], float4: str | None = None, half_prekernel: str | None = None, uses_vload: bool = False, external_local_bufs: bool = False, uses_ptr_arithmetic: bool = False, launch_bounds: bool = False, code_for_op: ~typing.Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>})[source]

Bases: NamedTuple

Class representing a C-style programming language.

size_prefix

Prefix for size. Defaults to “int”.

Type:: str

generic_var_prefix

Prefix for generic variables. Defaults to empty string.

Type:: str

kernel_prefix

Prefix for kernels. Defaults to empty string.

Type:: str

buffer_prefix

Prefix for buffers. Defaults to empty string.

Type:: str

buffer_suffix

Suffix for buffers. Defaults to empty string.

Type:: str

smem_align

Alignment for shared memory. Defaults to empty string.

Type:: str

smem_prefix

Prefix for shared memory. Defaults to empty string.

Type:: str

smem_prefix_for_cast

Indicates whether prefix should be used in casts. Defaults to True.

Type:: bool

arg_int_prefix

Prefix for integer arguments. Defaults to empty string.

Type:: str

barrier

Barrier synchronization method. Defaults to empty string.

Type:: str

xid

List of x identifiers. Defaults to empty list.

Type:: List[str]

gid

List of g identifiers. Defaults to empty list.

Type:: List[str]

lid

List of l identifiers. Defaults to empty list.

Type:: List[str]

global_max

List of maximum global values. Defaults to empty list.

Type:: List[int]

local_max

List of maximum local values. Defaults to empty list.

Type:: List[int]

extra_args

List of extra arguments. Defaults to empty list.

Type:: List[str]

float4

Float4 value with None as default.

Type:: Optional[str]

half_prekernel

Half pre-kernel value with None as default.

Type:: Optional[str]

uses_vload

Indicates whether vload is used. Defaults to False.

Type:: bool

external_local_bufs

Indicates whether external local buffers are used. Defaults to False.

Type:: bool

uses_ptr_arithmetic

Indicates whether pointer arithmetic is used. Defaults to False.

Type:: bool

launch_bounds

Indicates whether launch bounds are used. Defaults to False.

Type:: bool

code_for_op

Dictionary containing operations for unary, binary, and ternary ops.

Type:: Dict

arg_int_prefix: str: Alias for field number 8

barrier: str: Alias for field number 9

buffer_prefix: str: Alias for field number 3

buffer_suffix: str: Alias for field number 4

code_for_op: Dict: Alias for field number 22

external_local_bufs: bool: Alias for field number 19

extra_args: List[str]: Alias for field number 15

float4: str | None: Alias for field number 16

generic_var_prefix: str: Alias for field number 1

gid: List[str]: Alias for field number 11

global_max: List[int]: Alias for field number 13

half_prekernel: str | None: Alias for field number 17

kernel_prefix: str: Alias for field number 2

launch_bounds: bool: Alias for field number 21

lid: List[str]: Alias for field number 12

local_max: List[int]: Alias for field number 14

render_cast(x: List[str], var_dtype: DType) → str[source]

Returns a string expression of the casted xs with the given type.

self

The instance of the class.

Type:: Any

x

A list of strings to be casted.

Type:: List[str]

var_dtype

The datatype into which the elements of x are to be casted.

Type:: DType

Returns:: String expression of the casted xs with the given type.
Return type:: str
Raises:: AssertionError – If length of x is not equal to var_dtype.sz or if float4 attribute is None.

render_conditional(cond: str, x: str, y: str) → str[source]

Render a conditional expression.

self

The object itself

Type:: Any

cond

The condition of the conditional expression

Type:: str

x

The value to return if the condition is true

Type:: str

y

The value to return if the condition is false

Type:: str

Returns:: The rendered conditional expression
Return type:: str

render_const(x: float | int | bool, var_dtype) → str[source]

Returns a string expression of the constant with the given type.

x

The input value

Type:: Union[float, int, bool]

var_dtype: The data type of the variable

Returns:: A string representation of the constant
Return type:: str

render_for(expr: str, _min: int | str, _max: int | str) → str[source]

Render a for loop.

self

The object itself

Type:: Any

expr

The loop variable expression

Type:: str

_min

The start value of the loop variable

Type:: Union[int, str]

_max

The end value of the loop variable

Type:: Union[int, str]

Returns:: The rendered for loop
Return type:: str

render_if(cond: str)[source]

Render an if statement.

self

The object itself

Type:: Any

cond

The condition of the if statement

Type:: str

Returns:: The rendered if statement
Return type:: str

render_kernel(function_name: str, kernel: List[str], bufs: List[Tuple[str, DType]], local_size: List[int], prekernel: List[str]) → str[source]

Render the kernel with given parameters.

This function generates a complete OpenCL kernel program based on the provided arguments. It creates the necessary boilerplate code for the kernel, including sampler creation and buffer type definitions, then concatenates the actual kernel code and post-processing code if needed.

self

The instance of the OpenCLKernelBuilder class.

Type:: OpenCLKernelBuilder

function_name

The name of the kernel function.

Type:: str

kernel

The list of strings that form the body of the kernel function.

Type:: List[str]

bufs

A list of tuples where each tuple contains a buffer name and its data type.

Type:: List[Tuple[str, DType]]

local_size

The local size for the kernel execution.

Type:: List[int]

prekernel

Pre-kernel code that will be inserted before the actual kernel code.

Type:: List[str]

Returns:: The complete OpenCL kernel program as a string.
Return type:: str

render_load(output_dtype, buf_name, buf_dtype, idx, local=False) → str[source]

Returns a string expression of the loaded value with the output type.

output_dtype: The output data type

buf_name

The buffer name

Type:: str

buf_dtype: The data type of the buffer

idx

The index in the buffer

Type:: int

local

Whether the buffer is local or not

Type:: bool

Returns:: A string representation of the loaded value
Return type:: str

render_local(name: str, size: int)[source]

Render the local memory variable declaration.

self

The object itself

Type:: Any

name

The name of the local memory variable

Type:: str

size

The size of the local memory variable array

Type:: int

Returns:: The rendered local memory variable declaration
Return type:: str

render_store(buf_name: str, buf_dtype: DType, var_name: str, var_dtype: DType, idx: str, local=False) → str[source]

Returns a string statement that performs the store operation.

self

The instance of the class.

Type:: Any

buf_name

The buffer name.

Type:: str

buf_dtype

The buffer data type.

Type:: DType

var_name

The variable name.

Type:: str

var_dtype

The variable data type.

Type:: DType

idx

Index of the operation.

Type:: str

local

Whether the operation is local or not. Defaults to False.

Type:: bool, optional

Returns:: String statement that performs the store operation based on the given parameters.
Return type:: str

size_prefix: str: Alias for field number 0

smem_align: str: Alias for field number 5

smem_prefix: str: Alias for field number 6

smem_prefix_for_cast: bool: Alias for field number 7

uses_ptr_arithmetic: bool: Alias for field number 20

uses_vload: bool: Alias for field number 18

xid: List[str]: Alias for field number 10

class tinygrad.renderer.cstyle.CUDALanguage(size_prefix: str = 'int', generic_var_prefix: str = '', kernel_prefix: str = '', buffer_prefix: str = '', buffer_suffix: str = '', smem_align: str = '', smem_prefix: str = '', smem_prefix_for_cast: bool = True, arg_int_prefix: str = '', barrier: str = '', xid: ~typing.List[str] = [], gid: ~typing.List[str] = [], lid: ~typing.List[str] = [], global_max: ~typing.List[int] = [], local_max: ~typing.List[int] = [], extra_args: ~typing.List[str] = [], float4: str | None = None, half_prekernel: str | None = None, uses_vload: bool = False, external_local_bufs: bool = False, uses_ptr_arithmetic: bool = False, launch_bounds: bool = False, code_for_op: ~typing.Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>})[source]

Bases: CStyleLanguage

The CUDALanguage class is a subclass of the CStyleLanguage class, specifically tailored for CUDA GPU programming.

kernel_prefix

Prefix for defining kernels in CUDA.

Type:: str

smem_prefix

Prefix for shared memory variables in CUDA.

Type:: str

smem_prefix_for_cast

Flag to indicate whether prefix should be used with casting for shared memory.

Type:: bool

arg_int_prefix

Prefix for integer arguments in CUDA kernel functions.

Type:: str

barrier

Code snippet for thread synchronization in CUDA.

Type:: str

float4

Function name for creating a 4-component floating point number in CUDA.

Type:: str

gid

List of strings representing the global index in each dimension (x, y, z).

Type:: list

lid

List of strings representing the local index in each dimension (x, y, z).

Type:: list

xid

List of strings representing the combined global and local indices for each dimension (x, y, z).

Type:: list

code_for_op

Dictionary mapping binary operations to their corresponding CUDA code.

Type:: dict

half_prekernel

Pre-kernel code needed when working with half-precision floating point numbers in CUDA.

Type:: str

arg_int_prefix: str = 'const int'

barrier: str = '__syncthreads();'

code_for_op: Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CUDALanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>}

float4: str | None = 'make_float4'

gid: List[str] = ['blockIdx.x', 'blockIdx.y', 'blockIdx.z']

half_prekernel: str | None = '\n #include <cuda_fp16.h>\n struct half4 { half x, y, z, w; };\n __device__ half4 make_half4(half x, half y, half z, half w) { half4 ret; ret.x = x; ret.y = y; ret.z = z; ret.w = w; return ret; }\n '

kernel_prefix: str = '#define INFINITY (__int_as_float(0x7f800000))\n#define NAN (__int_as_float(0x7fffffff))\nextern "C" __global__ '

lid: List[str] = ['threadIdx.x', 'threadIdx.y', 'threadIdx.z']

smem_prefix: str = '__shared__ '

smem_prefix_for_cast: bool = False

xid: List[str] = ['(blockIdx.x*blockDim.x+threadIdx.x)', '(blockIdx.y*blockDim.y+threadIdx.y)', '(blockIdx.z*blockDim.z+threadIdx.z)']

tinygrad.renderer.cstyle.CUDARenderer(function_name: str, uops: List[UOp]) → Tuple[str, Dict]

Converts a list of micro-operations (uops) to the specified C-style language.

Parameters:

lang (CStyleLanguage) – The target C-style programming language for conversion.
function_name (str) – The name of the function being converted.
uops (List[UOp]) – A list of micro-operations to convert.

Returns:

A tuple containing the converted C-style code and a dictionary.

Return type:

Tuple[str, Dict]

Attributes:

local_size (List[int]): Holds the size of local variables.
kernel, prekernel, bufs (List[]): Lists for storing generated code and buffers.
depth (int): The indentation level for the generated code. Default is 1.
kk: A helper function to append lines of code with proper indentation.
c, r (DefaultDict[str, int], Dict): Counters and mappings for temporary variables.
ssa: A helper function to generate single static assignment (SSA) form representations.
child_count (DefaultDict[UOp, int]): A counter for the number of children each uop has.

class tinygrad.renderer.cstyle.HIPLanguage(size_prefix: str = 'int', generic_var_prefix: str = '', kernel_prefix: str = '', buffer_prefix: str = '', buffer_suffix: str = '', smem_align: str = '', smem_prefix: str = '', smem_prefix_for_cast: bool = True, arg_int_prefix: str = '', barrier: str = '', xid: ~typing.List[str] = [], gid: ~typing.List[str] = [], lid: ~typing.List[str] = [], global_max: ~typing.List[int] = [], local_max: ~typing.List[int] = [], extra_args: ~typing.List[str] = [], float4: str | None = None, half_prekernel: str | None = None, uses_vload: bool = False, external_local_bufs: bool = False, uses_ptr_arithmetic: bool = False, launch_bounds: bool = False, code_for_op: ~typing.Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>})[source]

Bases: CStyleLanguage

HIPLanguage class that inherits from CStyleLanguage.

kernel_prefix

Kernel prefix containing various function definitions and necessary includes.

Type:: str

launch_bounds

A boolean indicating if the language supports launch bounds.

Type:: bool

smem_prefix

The shared memory prefix for the language.

Type:: str

smem_prefix_for_cast

A boolean indicating whether a cast is required for the shared memory prefix.

Type:: bool

barrier

The barrier synchronization primitive for the language.

Type:: str

float4

The name of the float4 type for the language.

Type:: str

uses_vload

Whether the language uses vload.

Type:: bool

uses_ptr_arithmetic

Whether the language uses pointer arithmetic.

Type:: bool

arg_int_prefix

The integer prefix for function arguments.

Type:: str

gid

A list comprising strings representing the grid index in three dimensions.

Type:: list

lid

A list comprising strings representing the local thread index in three dimensions.

Type:: list

xid

A list comprising strings representing the extended thread index in three dimensions.

Type:: list

code_for_op

A dictionary containing lambda functions for different operations and their

Type:: dict

corresponding implementations based on data type.

arg_int_prefix: str = 'const int'

barrier: str = '__syncthreads();'

code_for_op: Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function HIPLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function HIPLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>}

float4: str | None = 'make_float4'

gid: List[str] = ['blockIdx.x', 'blockIdx.y', 'blockIdx.z']

half_prekernel: str | None = '#include <hip/hip_fp16.h>\n\ntypedef union { struct { half x, y, z, w; } __attribute__((aligned(8))); half data[4]; } half4; __device__ half4 make_half4(half x, half y, half z, half w) { return {x, y, z, w}; }\ntypedef union { struct { half x, y, z, w, a, b, c, d; } __attribute__((aligned(16))); half data[8]; } half8; __device__ half8 make_half8(half x, half y, half z, half w, half a, half b, half c, half d) { return {x, y, z, w, a, b, c, d}; }\n typedef _Float16 half16 __attribute__((ext_vector_type(16))); __device__ half16 make_half16(half x, half y, half z, half w, half a, half b, half c, half d, half e, half f, half g, half h, half i, half j, half k, half l) { return {x, y, z, w, a, b, c, d, e, f, g, h, i, j, k, l}; }\n__device__ float vload_half(size_t offset, const half *p) { return (float)*(p + offset); }\n__device__ float2 vload_half2(size_t offset, const half *p) { return make_float2((float)*(p + offset*2), (float)*(p + offset*2 + 1)); }\n__device__ float4 vload_half4(size_t offset, const half *p) { return make_float4((float)*(p + offset*4), (float)*(p + offset*4 + 1), (float)*(p + offset*4 + 2), (float)*(p + offset*4 + 3)); }\n__device__ void vstore_half(float data, size_t offset, half *p) { *(p + offset) = (half)data; }\n__device__ void vstore_half2(float2 data, size_t offset, half *p) { *(p + offset*2) = (half)data.x; *(p + offset*2 + 1) = (half)data.y; }\n__device__ void vstore_half4(float4 data, size_t offset, half *p) { *(p + offset*4) = (half)data.x; *(p + offset*4 + 1) = (half)data.y; *(p + offset*4 + 2) = (half)data.z; *(p + offset*4 + 3) = (half)data.w; }\n__device__ half exp2(half x) { return hexp2(x); }\n__device__ half log2(half x) { return hlog2(x); }\n__device__ half sin(half x) { return hsin(x); }\n__device__ half sqrt(half x) { return hsqrt(x); }\n__device__ half hmax(half a, half b) { return __hgt(a, b) ? a : b; }\n__device__ half operator%(const half &a, const half &b) { return __hsub(a, __hmul(b, __float2half(floorf(__half2float(a) / __half2float(b))))); }\n__device__ bool operator!=(const half &a, const int &b) { return (float)a != b; }\n\n// HACKS for ALU ops on half and result of half2 GEP\n__device__ half operator+(const half &a, const unsigned short &b) { return __hadd(a, (half)(b)); }\n__device__ half operator-(const half &a, const unsigned short &b) { return __hsub(a, (half)(b)); }\n__device__ half operator*(const half &a, const unsigned short &b) { return __hmul(a, (half)(b)); }\n__device__ half operator/(const half &a, const unsigned short &b) { return __hdiv(a, (half)(b)); }\n__device__ bool operator<(const half &a, const unsigned short &b) { return __hlt(a, (half)(b)); }\n// now the other way\n__device__ half operator+(const unsigned short &a, const half &b) { return __hadd((half)(a), b); }\n__device__ half operator-(const unsigned short &a, const half &b) { return __hsub((half)(a), b); }\n__device__ half operator*(const unsigned short &a, const half &b) { return __hmul((half)(a), b); }\n__device__ half operator/(const unsigned short &a, const half &b) { return __hdiv((half)(a), b); }\n__device__ bool operator<(const unsigned short &a, const half &b) { return __hlt((half)(a), b); }\n '

kernel_prefix: str = '#include <hip/hip_common.h>\n#define INFINITY (__builtin_inff())\n#define NAN (__builtin_nanf(""))\n __device__ float4 max(float4 x, float4 y) { return float4(max(x.x, y.x), max(x.y, y.y), max(x.z, y.z), max(x.w, y.w)); }\n __device__ float4 pow(float x, float4 y) { return float4(pow(x, y.x), pow(x, y.y), pow(x, y.z), pow(x, y.w)); }\n __device__ float4 pow(float4 x, float4 y) { return float4(pow(x.x, y.x), pow(x.y, y.y), pow(x.z, y.z), pow(x.w, y.w)); }\n __device__ float4 log2(float4 x) { return float4(log2(x.x), log2(x.y), log2(x.z), log2(x.w)); }\n __device__ float4 exp2(float4 x) { return float4(exp2(x.x), exp2(x.y), exp2(x.z), exp2(x.w)); }\n __device__ float4 sin(float4 x) { return float4(sin(x.x), sin(x.y), sin(x.z), sin(x.w)); }\n typedef float float8 __attribute__((ext_vector_type(8))); __device__ float8 make_float8(float x, float y, float z, float w, float a, float b, float c, float d) { return {x, y, z, w, a, b, c, d}; }\n extern "C" __global__\n '

launch_bounds: bool = True

lid: List[str] = ['threadIdx.x', 'threadIdx.y', 'threadIdx.z']

smem_prefix: str = '__shared__ '

smem_prefix_for_cast: bool = False

uses_ptr_arithmetic: bool = True

uses_vload: bool = True

xid: List[str] = ['(blockIdx.x*blockDim.x+threadIdx.x)', '(blockIdx.y*blockDim.y+threadIdx.y)', '(blockIdx.z*blockDim.z+threadIdx.z)']

tinygrad.renderer.cstyle.HIPRenderer(function_name: str, uops: List[UOp]) → Tuple[str, Dict]

Converts a list of micro-operations (uops) to the specified C-style language.

Parameters:

lang (CStyleLanguage) – The target C-style programming language for conversion.
function_name (str) – The name of the function being converted.
uops (List[UOp]) – A list of micro-operations to convert.

Returns:

A tuple containing the converted C-style code and a dictionary.

Return type:

Tuple[str, Dict]

Attributes:

local_size (List[int]): Holds the size of local variables.
kernel, prekernel, bufs (List[]): Lists for storing generated code and buffers.
depth (int): The indentation level for the generated code. Default is 1.
kk: A helper function to append lines of code with proper indentation.
c, r (DefaultDict[str, int], Dict): Counters and mappings for temporary variables.
ssa: A helper function to generate single static assignment (SSA) form representations.
child_count (DefaultDict[UOp, int]): A counter for the number of children each uop has.

class tinygrad.renderer.cstyle.MetalLanguage(size_prefix: str = 'int', generic_var_prefix: str = '', kernel_prefix: str = '', buffer_prefix: str = '', buffer_suffix: str = '', smem_align: str = '', smem_prefix: str = '', smem_prefix_for_cast: bool = True, arg_int_prefix: str = '', barrier: str = '', xid: ~typing.List[str] = [], gid: ~typing.List[str] = [], lid: ~typing.List[str] = [], global_max: ~typing.List[int] = [], local_max: ~typing.List[int] = [], extra_args: ~typing.List[str] = [], float4: str | None = None, half_prekernel: str | None = None, uses_vload: bool = False, external_local_bufs: bool = False, uses_ptr_arithmetic: bool = False, launch_bounds: bool = False, code_for_op: ~typing.Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>})[source]

Bases: CStyleLanguage

MetalLanguage Class: Inherits from CStyleLanguage. Defines Metal language specific attributes and methods.

Attributes:
kernel_prefix (str): Prefix for kernel functions. Default is “#include <metal_stdlib>

using namespace metal; kernel “.

buffer_prefix (str): Prefix for device buffers. Default is “device “. smem_prefix (str): Prefix for threadgroup shared memory. Default is “threadgroup “. arg_int_prefix (str): Prefix for constant integer arguments. Default is “constant int&”. barrier (str): Thread group barrier for synchronization. Default is “threadgroup_barrier(mem_flags::mem_threadgroup);”. float4 (str): Data type for four component floating point values. Default is “float4”. uses_ptr_arithmetic (bool): Indicates whether the language requires pointer arithmetic. Default is True. gid (list[str]): List of strings representing global thread IDs. Generated from range 3. lid (list[str]): List of strings representing local thread IDs. Generated from range 3. extra_args (list[str]): Additional arguments required by Metal language. Default is [“uint3 gid [[threadgroup_position_in_grid]]”, “uint3 lid [[thread_position_in_threadgroup]]”].

arg_int_prefix: str = 'constant int&'

barrier: str = 'threadgroup_barrier(mem_flags::mem_threadgroup);'

buffer_prefix: str = 'device '

extra_args: List[str] = ['uint3 gid [[threadgroup_position_in_grid]]', 'uint3 lid [[thread_position_in_threadgroup]]']

float4: str | None = 'float4'

gid: List[str] = ['gid.x', 'gid.y', 'gid.z']

kernel_prefix: str = '#include <metal_stdlib>\nusing namespace metal;\nkernel '

lid: List[str] = ['lid.x', 'lid.y', 'lid.z']

smem_prefix: str = 'threadgroup '

uses_ptr_arithmetic: bool = True

tinygrad.renderer.cstyle.MetalRenderer(function_name: str, uops: List[UOp]) → Tuple[str, Dict]

Converts a list of micro-operations (uops) to the specified C-style language.

Parameters:

lang (CStyleLanguage) – The target C-style programming language for conversion.
function_name (str) – The name of the function being converted.
uops (List[UOp]) – A list of micro-operations to convert.

Returns:

A tuple containing the converted C-style code and a dictionary.

Return type:

Tuple[str, Dict]

Attributes:

local_size (List[int]): Holds the size of local variables.
kernel, prekernel, bufs (List[]): Lists for storing generated code and buffers.
depth (int): The indentation level for the generated code. Default is 1.
kk: A helper function to append lines of code with proper indentation.
c, r (DefaultDict[str, int], Dict): Counters and mappings for temporary variables.
ssa: A helper function to generate single static assignment (SSA) form representations.
child_count (DefaultDict[UOp, int]): A counter for the number of children each uop has.

class tinygrad.renderer.cstyle.OpenCLLanguage(size_prefix: str = 'int', generic_var_prefix: str = '', kernel_prefix: str = '', buffer_prefix: str = '', buffer_suffix: str = '', smem_align: str = '', smem_prefix: str = '', smem_prefix_for_cast: bool = True, arg_int_prefix: str = '', barrier: str = '', xid: ~typing.List[str] = [], gid: ~typing.List[str] = [], lid: ~typing.List[str] = [], global_max: ~typing.List[int] = [], local_max: ~typing.List[int] = [], extra_args: ~typing.List[str] = [], float4: str | None = None, half_prekernel: str | None = None, uses_vload: bool = False, external_local_bufs: bool = False, uses_ptr_arithmetic: bool = False, launch_bounds: bool = False, code_for_op: ~typing.Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>})[source]

Bases: CStyleLanguage

OpenCLLanguage class. Inherits from CStyleLanguage.

kernel_prefix

Prefix for kernel functions. Default is “__kernel “.

Type:: str

buffer_prefix

Prefix for buffer variables. Default is “__global “.

Type:: str

smem_align

Alignment attribute for shared memory. Default is “__attribute__ ((aligned (16))) “.

Type:: str

smem_prefix

Prefix for shared memory variables. Default is “__local “.

Type:: str

arg_int_prefix

Prefix for integer function arguments. Default is “const int”.

Type:: str

half_prekernel

OpenCL extension pragma for enabling half precision. Default is “#pragma OPENCL EXTENSION cl_khr_fp16 : enable”.

Type:: str

barrier

Barrier code for synchronizing threads in a work group. Default is “barrier(CLK_LOCAL_MEM_FENCE);”.

Type:: str

float4

String conversion for type float4. Default is “(float4)”.

Type:: str

gid

List of get_group_id function calls for dimensions 0, 1, and 2.

Type:: list of str

lid

List of get_local_id function calls for dimensions 0, 1, and 2.

Type:: list of str

xid

List of get_global_id function calls for dimensions 0, 1, and 2.

Type:: list of str

uses_vload

Flag indicating if vload is used. Default is True.

Type:: bool

code_for_op

Dictionary mapping operation names to lambda functions that generate code for the operations. Inherits from CStyleLanguage and adds a new entry for TernaryOps.MULACC.

Type:: dict

arg_int_prefix: str = 'const int'

barrier: str = 'barrier(CLK_LOCAL_MEM_FENCE);'

buffer_prefix: str = '__global '

code_for_op: Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function OpenCLLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>}

float4: str | None = '(float4)'

gid: List[str] = ['get_group_id(0)', 'get_group_id(1)', 'get_group_id(2)']

half_prekernel: str | None = '#pragma OPENCL EXTENSION cl_khr_fp16 : enable'

kernel_prefix: str = '__kernel '

lid: List[str] = ['get_local_id(0)', 'get_local_id(1)', 'get_local_id(2)']

smem_align: str = '__attribute__ ((aligned (16))) '

smem_prefix: str = '__local '

uses_vload: bool = True

xid: List[str] = ['get_global_id(0)', 'get_global_id(1)', 'get_global_id(2)']

tinygrad.renderer.cstyle.OpenCLRenderer(function_name: str, uops: List[UOp]) → Tuple[str, Dict]

Converts a list of micro-operations (uops) to the specified C-style language.

Parameters:

lang (CStyleLanguage) – The target C-style programming language for conversion.
function_name (str) – The name of the function being converted.
uops (List[UOp]) – A list of micro-operations to convert.

Returns:

A tuple containing the converted C-style code and a dictionary.

Return type:

Tuple[str, Dict]

Attributes:

local_size (List[int]): Holds the size of local variables.
kernel, prekernel, bufs (List[]): Lists for storing generated code and buffers.
depth (int): The indentation level for the generated code. Default is 1.
kk: A helper function to append lines of code with proper indentation.
c, r (DefaultDict[str, int], Dict): Counters and mappings for temporary variables.
ssa: A helper function to generate single static assignment (SSA) form representations.
child_count (DefaultDict[UOp, int]): A counter for the number of children each uop has.

class tinygrad.renderer.cstyle.WGSLLanguage(size_prefix: str = 'int', generic_var_prefix: str = '', kernel_prefix: str = '', buffer_prefix: str = '', buffer_suffix: str = '', smem_align: str = '', smem_prefix: str = '', smem_prefix_for_cast: bool = True, arg_int_prefix: str = '', barrier: str = '', xid: ~typing.List[str] = [], gid: ~typing.List[str] = [], lid: ~typing.List[str] = [], global_max: ~typing.List[int] = [], local_max: ~typing.List[int] = [], extra_args: ~typing.List[str] = [], float4: str | None = None, half_prekernel: str | None = None, uses_vload: bool = False, external_local_bufs: bool = False, uses_ptr_arithmetic: bool = False, launch_bounds: bool = False, code_for_op: ~typing.Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function CStyleLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function CStyleLanguage.<lambda>>, TernaryOps.WHERE: <function CStyleLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>})[source]

Bases: CStyleLanguage

The WGSLLanguage class, a subclass of CStyleLanguage. This class is used to represent the WebGPU Shading Language (WGSL).

gid

List containing gid indices for x, y, and z dimensions.

Type:: list

lid

List containing lid indices for x, y, and z dimensions.

Type:: list

size_prefix

Prefix used to declare a variable. Defaults to “let”.

Type:: str

barrier

Code snippet for workgroup barrier. Defaults to “workgroupBarrier();”.

Type:: str

generic_var_prefix

Prefix used for generic variables. Defaults to “var “.

Type:: str

external_local_bufs

Flag indicating if local buffers are external. Defaults to True.

Type:: bool

code_for_op

Dictionary containing custom op codes for various operations like BinaryOps and TernaryOps.

Type:: dict

type_map

Dictionary mapping dtypes to WGSL types.

Type:: dict

barrier: str = 'workgroupBarrier();'

code_for_op: Dict = {BinaryOps.ADD: <function CStyleLanguage.<lambda>>, BinaryOps.CMPLT: <function WGSLLanguage.<lambda>>, BinaryOps.DIV: <function CStyleLanguage.<lambda>>, BinaryOps.MAX: <function CStyleLanguage.<lambda>>, BinaryOps.MOD: <function CStyleLanguage.<lambda>>, BinaryOps.MUL: <function CStyleLanguage.<lambda>>, BinaryOps.SUB: <function CStyleLanguage.<lambda>>, TernaryOps.MULACC: <function WGSLLanguage.<lambda>>, TernaryOps.WHERE: <function WGSLLanguage.<lambda>>, UnaryOps.EXP2: <function CStyleLanguage.<lambda>>, UnaryOps.LOG2: <function CStyleLanguage.<lambda>>, UnaryOps.NEG: <function CStyleLanguage.<lambda>>, UnaryOps.SIN: <function CStyleLanguage.<lambda>>, UnaryOps.SQRT: <function CStyleLanguage.<lambda>>}

external_local_bufs: bool = True

generic_var_prefix: str = 'var '

gid: List[str] = ['i32(gindex.x)', 'i32(gindex.y)', 'i32(gindex.z)']

lid: List[str] = ['i32(lindex.x)', 'i32(lindex.y)', 'i32(lindex.z)']

render_cast(x: List[str], var_dtype: DType) → str[source]

Render a type cast for the given value with the target data type.

x

The value to be casted.

Type:: List[str]

var_dtype

The target data type.

Type:: DType

Returns:: The rendered type cast expression as a string.
Return type:: str
Raises:: NotImplementedError – If no cast is available for the target data type.

render_conditional(cond: str, x: str, y: str) → str[source]

Render a conditional expression that selects between two values based on a condition.

cond

The conditional expression to be checked.

Type:: str

x

The value to select if the condition is true.

Type:: str

y

The value to select if the condition is false.

Type:: str

Returns:: The rendered conditional expression as a string.
Return type:: str

render_const(x: float | int, var_dtype) → str[source]

Render a constant value.

Parameters:

x (Union[float, int]) – The constant value to be rendered.
var_dtype – The data type of the variable.

Returns:

The WGSL code snippet for the constant value.

Return type:

str

render_for(expr: str, _min: int | str, _max: int | str) → str[source]

Render a for loop with the given expression and range.

expr

The loop variable’s name.

Type:: str

_min

The starting value of the loop variable.

Type:: Union[int, str]

_max

The ending value of the loop variable.

Type:: Union[int, str]

Returns:: The rendered for loop as a string.
Return type:: str

render_if(cond: str)[source]

Render an if statement with the given condition.

cond

The conditional expression to be checked in the if statement.

Type:: str

Returns:: The rendered if statement as a string.
Return type:: str

render_kernel(function_name: str, kernel: List[str], bufs: List[Tuple[str, DType]], local_size: List[int], prekernel: List[str]) → str[source]

Render the kernel for execution.

function_name

The name of the function to be rendered.

Type:: str

kernel

The list of kernel code lines.

Type:: List[str]

bufs

A list of tuples containing buffer names and their respective data types.

Type:: List[Tuple[str, DType]]

local_size

The local size for workgroup execution. If not provided, default is [1].

Type:: List[int]

prekernel

Code lines to be executed before the kernel code.

Type:: List[str]

Returns:: The rendered kernel code as a string.
Return type:: str

render_local(name: str, size: int)[source]

Render a local variable declaration.

Parameters:

name (str) – The name of the variable to be declared.
size (int) – The size of the array.

Returns:

The WGSL code snippet for declaring a local variable.

Return type:

str

render_store(buf_name: str, buf_dtype: DType, var_name: str, var_dtype: DType, idx, local=False) → str[source]

Render a store operation that stores a value in a buffer at the given index.

buf_name

The name of the buffer.

Type:: str

buf_dtype

The data type of the buffer.

Type:: DType

var_name

The name of the variable to store.

Type:: str

var_dtype

The data type of the variable.

Type:: DType

idx: The index at which to store the value in the buffer.

local

Whether this is a local store operation. Defaults to False.

Type:: bool, optional

Returns:: The rendered store operation as a string.
Return type:: str

size_prefix: str = 'let'

type_map = {(0, 1, 'bool', <class 'numpy.bool_'>, 1): 'bool', (5, 4, 'int', <class 'numpy.int32'>, 1): 'i32', (6, 4, 'unsigned int', <class 'numpy.uint32'>, 1): 'u32', (9, 2, 'half', <class 'numpy.float16'>, 1): 'f16', (10, 4, 'float', <class 'numpy.float32'>, 1): 'f32'}

tinygrad.renderer.cstyle.WGSLRenderer(function_name: str, uops: List[UOp]) → Tuple[str, Dict]

Converts a list of micro-operations (uops) to the specified C-style language.

Parameters:

lang (CStyleLanguage) – The target C-style programming language for conversion.
function_name (str) – The name of the function being converted.
uops (List[UOp]) – A list of micro-operations to convert.

Returns:

A tuple containing the converted C-style code and a dictionary.

Return type:

Tuple[str, Dict]

Attributes:

local_size (List[int]): Holds the size of local variables.
kernel, prekernel, bufs (List[]): Lists for storing generated code and buffers.
depth (int): The indentation level for the generated code. Default is 1.
kk: A helper function to append lines of code with proper indentation.
c, r (DefaultDict[str, int], Dict): Counters and mappings for temporary variables.
ssa: A helper function to generate single static assignment (SSA) form representations.
child_count (DefaultDict[UOp, int]): A counter for the number of children each uop has.

tinygrad.renderer.cstyle.uops_to_cstyle(lang: CStyleLanguage, function_name: str, uops: List[UOp]) → Tuple[str, Dict][source]

Converts a list of micro-operations (uops) to the specified C-style language.

Parameters:

lang (CStyleLanguage) – The target C-style programming language for conversion.
function_name (str) – The name of the function being converted.
uops (List[UOp]) – A list of micro-operations to convert.

Returns:

A tuple containing the converted C-style code and a dictionary.

Return type:

Tuple[str, Dict]

Attributes:

local_size (List[int]): Holds the size of local variables.
kernel, prekernel, bufs (List[]): Lists for storing generated code and buffers.
depth (int): The indentation level for the generated code. Default is 1.
kk: A helper function to append lines of code with proper indentation.
c, r (DefaultDict[str, int], Dict): Counters and mappings for temporary variables.
ssa: A helper function to generate single static assignment (SSA) form representations.
child_count (DefaultDict[UOp, int]): A counter for the number of children each uop has.