Defining Dimensions

Dimensions transform raw data columns into categorical labels with controlled ordering. Define them in your Malloy source using .extend().

Why Define Dimensions?

Transform raw codes into readable labels - Convert educ=12 into "High School"
Control sort order - Values appear in definition order, not alphabetically
Group values into categories - Combine multiple codes into logical groups
Handle NULL values - Explicitly control what happens to unmatched values

Definition Order

The order you define buckets determines their sort order in tables. For education levels, define them in logical order (<HS, HS, College) rather than alphabetical (College, HS, <HS).

Basic Syntax

Dimensions are defined using Malloy's pick expression within an extend block:

typescript

const tpl = fromCSV('data.csv').extend(`
  dimension:
    dimension_name is
      pick 'Label 1' when condition1
      pick 'Label 2' when condition2
      else default_value
`)

Examples

Simple Alias (Passthrough)

When the source column already contains the values you want:

typescript

const tpl = fromCSV('employees.csv').extend(`
  dimension: gender is gendchar
`)

This creates a dimension that passes through the values from gendchar unchanged.

Bucketed Dimension

Transform numeric codes into readable labels:

typescript

const tpl = fromCSV('employees.csv').extend(`
  dimension:
    education is
      pick '<HS' when educ < 12
      pick 'HS' when educ = 12
      pick 'College' when educ >= 13
      else null
`)

Values appear in definition order: <HS, HS, College - not alphabetically.

Education dimension (definition order)

TABLE ROWS education * income.sum COLS gender;

Dataset:samplesoccupation, education, gender, income

Compound Conditions

Use and and or to combine conditions:

typescript

const tpl = fromCSV('employees.csv').extend(`
  dimension:
    sector_label is
      pick 'Private' when sector = 1 or sector = 5 or sector = 6
      pick 'Public' when sector >= 2 and sector <= 4
      else null

    marital_status is
      pick 'Married' when marital >= 1 and marital <= 3
      pick 'Widowed' when marital = 4
      pick 'Divorced/Sep.' when marital = 5 or marital = 6
      pick 'Never Married' when marital = 7
      else null
`)

Detailed Bucketing

Create fine-grained categories:

typescript

const tpl = fromCSV('employees.csv').extend(`
  dimension:
    education_detail is
      pick '<HS' when educ < 12
      pick 'HS graduate' when educ = 12
      pick 'Some College' when educ >= 13 and educ <= 15
      pick 'College Grad' when educ = 16
      pick 'Some Graduate' when educ >= 17
      else null
`)

Detailed education (5 levels)

TABLE ROWS education_detail * income.sum COLS gender;

Dataset:samplesoccupation, education, gender, income

Condition Operators

Operator	Example	Meaning
`=`	`educ = 12`	Equal to
`<`	`educ < 12`	Less than
`>`	`educ > 12`	Greater than
`<=`	`educ <= 12`	Less than or equal
`>=`	`educ >= 13`	Greater than or equal
`and`	`educ >= 1 and educ <= 3`	Both conditions must be true
`or`	`educ = 1 or educ = 2`	Either condition can be true

The ELSE Clause

The else clause specifies what happens to values that don't match any pick condition:

typescript

// Exclude unmatched values (they become NULL and are filtered out)
const tpl = fromCSV('data.csv').extend(`
  dimension:
    education is
      pick '<HS' when educ < 12
      pick 'HS' when educ = 12
      pick 'College' when educ >= 13
      else null
`)

// Include unmatched values with a label
const tpl = fromCSV('data.csv').extend(`
  dimension:
    education is
      pick '<HS' when educ < 12
      pick 'HS' when educ = 12
      pick 'College' when educ >= 13
      else 'Unknown'
`)

WARNING

If you omit else, unmatched values will appear with their raw value, which may not be what you want.

Definition Order

This is a key feature. The order you define buckets controls how values sort in tables. When you define buckets using increasing numeric codes, the table will show values in that same order:

typescript

const tpl = fromCSV('employees.csv').extend(`
  dimension:
    // Values will appear in this order: Managerial, Professional, Technical, ...
    occupation is
      pick 'Managerial' when occup = 1
      pick 'Professional' when occup = 2
      pick 'Technical' when occup = 3
      pick 'Sales' when occup = 4
      pick 'Clerical' when occup = 5
      pick 'Services' when occup >= 6 and occup <= 8
      pick 'Manufacturing' when occup = 9 or occup = 10
      pick 'Transport' when occup = 11 or occup = 12
      pick 'Farming' when occup = 13 or occup = 14
      else null

    // Tell TPL to use the underlying numeric column for ordering
    occupation_order is occup
`)

The occupation_order is occup line tells TPL to sort by the underlying numeric codes rather than alphabetically. This ensures:

Default sort order - Tables show values in definition order
Limit behavior - [5] gives first 5 by definition order, not alphabetically
Implicit ordering - No need for @aggregate to get meaningful order

First 5 occupations (by definition order)

TABLE ROWS occupation[5] * income.sum COLS gender;

Dataset:samplesoccupation, education, gender, income

Try Variations

Multiple Dimensions

Define as many dimensions as needed. Each is independent:

typescript

const tpl = fromCSV('employees.csv').extend(`
  dimension:
    education is
      pick '<HS' when educ < 12
      pick 'HS' when educ = 12
      pick 'College' when educ >= 13
      else null
    education_order is educ

    gender is gendchar

    occupation is
      pick 'Managerial' when occup = 1
      pick 'Professional' when occup = 2
      // ... more buckets
      else null
    occupation_order is occup
`)

// Now use them in queries
const { html } = await tpl.query('TABLE ROWS occupation * gender COLS education * income.sum;')

Dimensions and Percentiles

Dimensions work seamlessly with percentile aggregations. The percentile is computed correctly within each dimension bucket:

Percentiles by education level

TABLE ROWS education * income.(p25 | p50 | p75) COLS gender;

Dataset:samplesoccupation, education, gender, income

Best Practices

Define buckets in logical order - Not alphabetical, but the order that makes sense for your data (e.g., education levels from lowest to highest)
Use else null for clean data - This filters out unexpected values rather than showing raw codes
Keep bucket labels concise - They become table headers, so shorter is better
Consider multiple granularities - Define both education (3 levels) and education_detail (5 levels) for different views
Add ordering dimensions - Include <dim>_order is <raw_column> to enable definition-order sorting

Alphabetic Limit - How definition order affects [N] limits
Order by Code Column - Ordering by underlying numeric codes
Data Model - Complete reference of playground dimensions

Defining Dimensions ​

Why Define Dimensions? ​

Basic Syntax ​

Examples ​

Simple Alias (Passthrough) ​

Bucketed Dimension ​

Compound Conditions ​

Detailed Bucketing ​

Condition Operators ​

The ELSE Clause ​

Definition Order ​

Multiple Dimensions ​

Dimensions and Percentiles ​

Best Practices ​

Related ​

Defining Dimensions

Why Define Dimensions?

Basic Syntax

Examples

Simple Alias (Passthrough)

Bucketed Dimension

Compound Conditions

Detailed Bucketing

Condition Operators

The ELSE Clause

Definition Order

Multiple Dimensions

Dimensions and Percentiles

Best Practices

Related