Try something like this in GBQ

[image: image.png]

-- data insertion
INSERT INTO `test.michboy` (user_id, feature_name, feature_value,
as_of_date, effective_from, effective_to)
SELECT
    user_id,
    'last_purchase_product',
    last_purchase_product,
    CURRENT_DATE() as_of_date,
    CURRENT_DATE() effective_from,
    DATE('2099-12-31') effective_to
FROM
    your_daily_user_table
WHERE
    date = CURRENT_DATE();

-- data update
UPDATE `test.michboy`
SET
    feature_value = 'new_product',
    effective_to = CURRENT_DATE() - 1
WHERE
    user_id = 1
    AND feature_name = 'last_purchase_product'
    AND effective_to = DATE('2099-12-31');

-- Example of data retrieval
SELECT
    feature_value
FROM
    `test.michboy`
WHERE
    user_id = 1
    AND feature_name = 'last_purchase_product'
    AND as_of_date = '2024-01-01'
    AND '2024-01-01' BETWEEN effective_from AND effective_to;

use as_of_date and the effective_from and effective_to range to retrieve
the correct feature value for a given date.

HTH

Mich Talebzadeh,

Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>





On Thu, 23 Jan 2025 at 19:42, Gunjan Kumar <gunjani...@gmail.com> wrote:

>
> What is the best way to implement a feature store in bigquery so that I
> can get features as of any date in past one year for a given user.
>
> Currently we have designed our bigquery table partitioned by daily date
> and we update the partition with the snaphopt of all users and it's
> features for that day. But this kind of data model has lot of duplicate
> data which is not required.
>
> How we can design this table in such a way that I don't loose the ability
> to get feature as of any date, but also avoid lot of duplicates.
>
> I want to have below functionality.
>
> Select last_purchase_product where user_id = 1 and as_of_date =
> '2024-01-01'
>
>

Reply via email to