Changefeeds on Tables with Column Families

On this page Carat arrow pointing down

You can create changefeeds on tables with more than one column family. Changefeeds will emit individual messages per column family on a table.

For further detail, see the following sections:

Syntax

To target a table with multiple column families, set the split_column_families option when creating a changefeed:

CREATE CHANGEFEED FOR TABLE {table} INTO {sink} WITH split_column_families;

To emit messages for a specific column family, use the FAMILY keyword:

CREATE CHANGEFEED FOR TABLE {table} FAMILY {family} INTO {sink};
Note:

You can also use Core changefeeds on tables with column families by using the EXPERIMENTAL CHANGEFEED FOR statement with split_column_families or the FAMILY keyword.

If a table has multiple column families, the FAMILY keyword will ensure the changefeed emits messages for each column family you define with FAMILY in the CREATE CHANGEFEED statement. If you do not specify FAMILY, then the changefeed will emit messages for all the table's column families.

To specify multiple families on the same table, it is necessary to define the table and family in both instances:

CREATE CHANGEFEED FOR TABLE tbl FAMILY f_1, TABLE tbl FAMILY f_2;

Message format

The response will follow a typical changefeed message format, but with the family name appended to the table name with a ., in the format table.family:

{"after":{"column":"value"},"key":[1],"topic":"table.family"}

For cloud storage sinks, the filename will include the family name appended to the table name with a +, in the format table+primary.

Avro schema names will include the family name concatenated to the table name.

The primary key columns will appear in the key for all column families, and will also appear in the value only for the families that they are a member of.

For example, if the table office_dogs has a column family primary, containing the primary key and a STRING column, and a secondary column family containing a different STRING column, then you'll receive two messages for an insert.

CREATE TABLE office_dogs (
   id INT PRIMARY KEY,
   name STRING,
   owner STRING,
   FAMILY primary (id, name),
   FAMILY secondary (owner)
 );

The changefeed targeting this table (started with split_column_families) will emit the following when there are inserts to the table:

{"after":{"id":4,"name":"Toby"},"key":[4],"topic":"office_dogs.primary"}],"length":1}
{"after":{"owner":"Ashley"},"key":[4],"topic":"office_dogs.secondary"}],"length":1}

The output shows the primary column family with 4 in the value ({"id":4,"name":"Toby"}) and the key ("key":[4]). The secondary family doesn't contain the id column, so the primary key 4 is only in the key and not the value. For an update that only affects data in one column family, the changefeed will send one message for that update relating to the family.

Considerations

  • If you create a table without column families and then start a changefeed with the split_column_families option, it is not possible to add column families. A subsequent ALTER TABLE statement adding a column family to the table will cause the changefeed to fail.
  • When you do not specify column family names in the CREATE or ALTER TABLE statement, the family names will default to either of the following:
    • primary: Since primary is a key word, you'll receive a syntax error if you run CREATE CHANGEFEED FOR table FAMILY primary. To avoid this syntax error, use double quotes: CREATE CHANGEFEED FOR table FAMILY "primary". You'll receive output from the changefeed like: table.primary.
    • fam_<zero-indexed family id>_<delimited list of columns>: For a table that does not include a name for the family: FAMILY (id, name), you'll receive output from the changefeed containing: table.fam_0_id_name. This references the table, the family ID and the two columns that this column family includes.
  • Creating a changefeed with CDC queries is not supported on tables with more than one column family.
  • When you create a changefeed on a table with more than one column family, the changefeed will emit messages per column family in separate streams. As a result, changefeed messages for different column families will arrive at the sink under separate topics. For more details, refer to Message format.

For examples of starting changefeeds on tables with column families, see the following examples for Enterprise and Core changefeeds.

Create a changefeed on a table with column families

In this example, you'll set up changefeeds on two tables that have column families. You'll use a single-node cluster sending changes to a webhook sink for this example, but you can use any changefeed sink to work with tables that include column families.

  1. If you do not already have one, request a trial Enterprise license.

  2. Use the cockroach start-single-node command to start a single-node cluster:

    icon/buttons/copy
    cockroach start-single-node --insecure --listen-addr=localhost
    
  3. As the root user, open the built-in SQL client:

    icon/buttons/copy
    cockroach sql --insecure
    
  4. Set your organization and license key:

    icon/buttons/copy
    SET CLUSTER SETTING cluster.organization = '<organization name>';
    
    icon/buttons/copy
    SET CLUSTER SETTING enterprise.license = '<secret>';
    
  5. Enable the kv.rangefeed.enabled cluster setting:

    icon/buttons/copy
    SET CLUSTER SETTING kv.rangefeed.enabled = true;
    
  6. In a separate terminal window, set up your HTTP server. Clone the test repository:

    icon/buttons/copy
    git clone https://github.com/cockroachlabs/cdc-webhook-sink-test-server.git
    
    icon/buttons/copy
    cd cdc-webhook-sink-test-server/go-https-server
    
  7. Next make the script executable and then run the server (passing a specific port if preferred, otherwise it will default to :3000):

    icon/buttons/copy
    chmod +x ./server.sh
    
    icon/buttons/copy
    ./server.sh <port>
    
  8. Back in your SQL shell, create a database called cdc_demo:

    icon/buttons/copy
    CREATE DATABASE cdc_demo;
    
  9. Set the database as the default:

    icon/buttons/copy
    USE cdc_demo;
    
  10. Create a table with two column families:

    icon/buttons/copy
    CREATE TABLE office_dogs (
        id INT PRIMARY KEY,
        name STRING,
        dog_owner STRING,
        FAMILY dogs (id, name),
        FAMILY employee (dog_owner)
      );
    
  11. Insert some data into the table:

    icon/buttons/copy
    INSERT INTO office_dogs (id, name, dog_owner) VALUES (1, 'Petee', 'Lauren'), (2, 'Max', 'Taylor'), (3, 'Patch', 'Sammy'), (4, 'Roach', 'Ashley');
    
  12. Create a second table that also defines column families:

    icon/buttons/copy
    CREATE TABLE office_plants (
         id INT PRIMARY KEY,
         plant_name STRING,
         office_floor INT,
         safe_for_dogs BOOL,
         FAMILY dog_friendly (office_floor, safe_for_dogs),
         FAMILY plant (id, plant_name)
       );
    
  13. Insert some data into office_plants:

    icon/buttons/copy
    INSERT INTO office_plants (id, plant_name, office_floor, safe_for_dogs) VALUES (1, 'Sansevieria', 11, false), (2, 'Monstera', 11, false), (3, 'Peperomia', 10, true), (4, 'Jade', 9, true);
    
  14. Create a changefeed on the office_dogs table targeting one of the column families. Use the FAMILY keyword in the CREATE statement:

    icon/buttons/copy
    CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true';
    

    You'll receive one message for each of the inserts that affects the specified column family:

    {"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
    
    Note:

    The ordering of messages is not guaranteed. That is, you may not always receive messages for the same row, or even the same change to the same row, next to each other.

    Alternatively, create a changefeed using the FAMILY keyword across two tables:

    icon/buttons/copy
    CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_plants FAMILY dog_friendly INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true';
    

    You'll receive one message for each insert that affects the specified column families:

    {"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"office_floor":11,"safe_for_dogs":false},"key":[1],"topic":"office_plants.dog_friendly"}],"length":1}
    {"payload":[{"after":{"office_floor":9,"safe_for_dogs":true},"key":[4],"topic":"office_plants.dog_friendly"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"office_floor":11,"safe_for_dogs":false},"key":[2],"topic":"office_plants.dog_friendly"}],"length":1}
    {"payload":[{"after":{"office_floor":10,"safe_for_dogs":true},"key":[3],"topic":"office_plants.dog_friendly"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1}
    

    This allows you to define particular column families for the changefeed to target, without necessarily specifying every family in a table.

    Note:

    To create a changefeed specifying two families on one table, ensure that you define the table and family in both instances:

    CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_dogs FAMILY dogs INTO {sink};

  15. To create a changefeed that emits messages for all column families in a table, use the split_column_families option:

    icon/buttons/copy
    CREATE CHANGEFEED FOR TABLE office_dogs INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true' with split_column_families;
    

    You'll receive output for both of the column families in the office_dogs table:

    {"payload":[{"after":{"id":1,"name":"Petee"},"key":[1],"topic":"office_dogs.dogs"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"id":2,"name":"Max"},"key":[2],"topic":"office_dogs.dogs"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"id":3,"name":"Patch"},"key":[3],"topic":"office_dogs.dogs"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1}
    {"payload":[{"after":{"id":4,"name":"Roach"},"key":[4],"topic":"office_dogs.dogs"}],"length":1}
    {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
    
    Note:

    You can find details of your changefeed job using SHOW CHANGEFEED JOBS. Changefeeds streaming to Kafka or Google Cloud Pub/Sub will populate the topics field in the SHOW CHANGEFEED JOBS output.

    When using the FAMILY keyword, the topics field will display in the format topic.family, e.g., office_dogs.employee,office_dogs.dogs. With the split_column_families option set, topics will show the topic name and a family placeholder topic.{family}, e.g., office_dogs.{family}.

  16. Update one of the values in the table:

    icon/buttons/copy
    UPDATE office_dogs SET name = 'Izzy' WHERE id = 4;
    

    This only affects one column family, which means you'll receive one message:

    {"payload":[{"after":{"id":4,"name":"Izzy"},"key":[4],"topic":"office_dogs.dogs"}],"length":1}
    

Create a Core changefeed on a table with column families

In this example, you'll set up Core changefeeds on two tables that have column families. You'll use a single-node cluster with the Core changefeed sending changes to the client.

  1. Use the cockroach start-single-node command to start a single-node cluster:

    icon/buttons/copy
    cockroach start-single-node --insecure --listen-addr=localhost
    
  2. As the root user, open the built-in SQL client:

    icon/buttons/copy
    cockroach sql --url="postgresql://root@127.0.0.1:26257?sslmode=disable" --format=csv
    
  3. Enable the kv.rangefeed.enabled cluster setting:

    icon/buttons/copy
    SET CLUSTER SETTING kv.rangefeed.enabled = true;
    
  4. Create a database called cdc_demo:

    icon/buttons/copy
    CREATE DATABASE cdc_demo;
    
  5. Set the database as the default:

    icon/buttons/copy
    USE cdc_demo;
    
  6. Create a table with two column families:

    icon/buttons/copy
    CREATE TABLE office_dogs (
          id INT PRIMARY KEY,
          name STRING,
          dog_owner STRING,
          FAMILY dogs (id, name),
          FAMILY employee (dog_owner)
        );
    
  7. Insert some data into the table:

    icon/buttons/copy
    INSERT INTO office_dogs (id, name, dog_owner) VALUES (1, 'Petee', 'Lauren'), (2, 'Max', 'Taylor'), (3, 'Patch', 'Sammy'), (4, 'Roach', 'Ashley');
    
  8. Create another table that also defines two column families:

    icon/buttons/copy
    CREATE TABLE office_plants (
         id INT PRIMARY KEY,
         plant_name STRING,
         office_floor INT,
         safe_for_dogs BOOL,
         FAMILY dog_friendly (office_floor, safe_for_dogs),
         FAMILY plant (id, plant_name)
       );
    
  9. Insert some data into office_plants:

    icon/buttons/copy
    INSERT INTO office_plants (id, plant_name, office_floor, safe_for_dogs) VALUES (1, 'Sansevieria', 11, false), (2, 'Monstera', 11, false), (3, 'Peperomia', 10, true), (4, 'Jade', 9, true);
    
  10. Create a changefeed on the office_dogs table targeting one of the column families. Use the FAMILY keyword in the statement:

    icon/buttons/copy
    EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee;
    

    You'll receive one message for each of the inserts that affects the specified column family:

    table,key,value
    office_dogs.employee,[1],"{""after"": {""owner"": ""Lauren""}}"
    office_dogs.employee,[2],"{""after"": {""owner"": ""Taylor""}}"
    office_dogs.employee,[3],"{""after"": {""owner"": ""Sammy""}}"
    office_dogs.employee,[4],"{""after"": {""owner"": ""Ashley""}}"
    
    Note:

    The ordering of messages is not guaranteed. That is, you may not always receive messages for the same row, or even the same change to the same row, next to each other.

    Alternatively, create a changefeed using the FAMILY keyword across two tables:

    icon/buttons/copy
    EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_plants FAMILY dog_friendly;
    

    You'll receive one message for each insert that affects the specified column families:

    table,key,value
    office_plants.dog_friendly,[1],"{""after"": {""office_floor"": 11, ""safe_for_dogs"": false}}"
    office_plants.dog_friendly,[2],"{""after"": {""office_floor"": 11, ""safe_for_dogs"": false}}"
    office_plants.dog_friendly,[3],"{""after"": {""office_floor"": 10, ""safe_for_dogs"": true}}"
    office_plants.dog_friendly,[4],"{""after"": {""office_floor"": 9, ""safe_for_dogs"": true}}"
    office_dogs.employee,[1],"{""after"": {""dog_owner"": ""Lauren""}}"
    office_dogs.employee,[2],"{""after"": {""dog_owner"": ""Taylor""}}"
    office_dogs.employee,[3],"{""after"": {""dog_owner"": ""Sammy""}}"
    office_dogs.employee,[4],"{""after"": {""dog_owner"": ""Ashley""}}"
    

    This allows you to define particular column families for the changefeed to target, without necessarily specifying every family in a table.

    Note:

    To create a changefeed specifying two families on one table, ensure that you define the table and family in both instances:

    EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_dogs FAMILY dogs;

  11. To create a changefeed that emits messages for all column families in a table, use the split_column_families option:

    icon/buttons/copy
    EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs WITH split_column_families;
    

    In your other terminal window, insert some more values:

    icon/buttons/copy
    cockroach sql --insecure -e "INSERT INTO cdc_demo.office_dogs (id, name, dog_owner) VALUES (5, 'Daisy', 'Cameron'), (6, 'Sage', 'Blair'), (7, 'Bella', 'Ellis');"
    

    Your changefeed will output the following:

    table,key,value
    office_dogs.dogs,[1],"{""after"": {""id"": 1, ""name"": ""Petee""}}"
    office_dogs.employee,[1],"{""after"": {""owner"": ""Lauren""}}"
    office_dogs.dogs,[2],"{""after"": {""id"": 2, ""name"": ""Max""}}"
    office_dogs.employee,[2],"{""after"": {""owner"": ""Taylor""}}"
    office_dogs.dogs,[3],"{""after"": {""id"": 3, ""name"": ""Patch""}}"
    office_dogs.employee,[3],"{""after"": {""owner"": ""Sammy""}}"
    office_dogs.dogs,[4],"{""after"": {""id"": 4, ""name"": ""Roach""}}"
    office_dogs.employee,[4],"{""after"": {""owner"": ""Ashley""}}"
    office_dogs.dogs,[5],"{""after"": {""id"": 5, ""name"": ""Daisy""}}"
    office_dogs.employee,[5],"{""after"": {""owner"": ""Cameron""}}"
    office_dogs.dogs,[6],"{""after"": {""id"": 6, ""name"": ""Sage""}}"
    office_dogs.employee,[6],"{""after"": {""owner"": ""Blair""}}"
    office_dogs.dogs,[7],"{""after"": {""id"": 7, ""name"": ""Bella""}}"
    office_dogs.employee,[7],"{""after"": {""owner"": ""Ellis""}}"
    
  12. In your other terminal window, update one of the values in the table:

    icon/buttons/copy
    cockroach sql --insecure -e "UPDATE cdc_demo.office_dogs SET name = 'Izzy' WHERE id = 4;"
    

    This only affects one column family, which means you'll receive one message:

    office_dogs.dogs,[4],"{""after"": {""id"": 4, ""name"": ""Izzy""}}"
    

See also


Yes No
On this page

Yes No