Quickstart

This section is intended mainly for developers who are already familiar with the fundamentals of Python and its common ML libraries and frameworks. If you are new to ML development, we recommend starting with the Tutorials.

We assume you have installed the giza-datasets library in your preferred environment. If not, check the installation guide.


  1. Import giza-datasets

from giza_datasets import DatasetsHub, DatasetsLoader

You may also need to run the following lines, which point SSL_CERT_FILE at the certificate bundle shipped with certifi. See DatasetsLoader for details.

import os
import certifi

os.environ['SSL_CERT_FILE'] = certifi.where()

  2. Query the datasets using a DatasetsHub object

hub = DatasetsHub()

With the DatasetsHub() object, we can now query the hub to find the right dataset for our ML model. See DatasetsHub for further instructions. Alternatively, you can browse the DatasetsHub pages to explore the available datasets.

Let's use the list_tags() function to list all the tags, and then get_by_tag() to query all the datasets with the "Yearn-v2" tag.

print(hub.list_tags())

['Trade Volume', 'DeFi', 'Yearn-v2', 'Interest Rates', 'compound-v2', ...]
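The full tag list can be long. A minimal sketch of narrowing it down with plain Python, assuming list_tags() returns a list of strings as the output above suggests. The tags list here is a stand-in copied from that sample output, and find_tags is a hypothetical helper, not part of giza-datasets.

```python
# Stand-in for hub.list_tags(); values copied from the sample output above.
tags = ["Trade Volume", "DeFi", "Yearn-v2", "Interest Rates", "compound-v2"]


def find_tags(tags, keyword):
    """Return the tags containing `keyword`, ignoring case (hypothetical helper)."""
    needle = keyword.lower()
    return [tag for tag in tags if needle in tag.lower()]


matches = find_tags(tags, "yearn")
print(matches)
```

Any tag returned this way can be passed to get_by_tag() exactly as written, since the helper returns the original strings unchanged.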

Yearn-v2 looks interesting, so let's find all the datasets that have the 'Yearn-v2' tag.

datasets = hub.get_by_tag('Yearn-v2')

for dataset in datasets:
    hub.describe(dataset.name)
                        Details for yearn-individual-deposits                        
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Attribute     ┃ Value                                                             ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Path          │ gs://datasets-giza/Yearn/Yearn_Individual_Deposits.parquet        │
├───────────────┼───────────────────────────────────────────────────────────────────┤
│ Description   │ Individual Yearn Vault deposits                                   │
├───────────────┼───────────────────────────────────────────────────────────────────┤
│ Tags          │ DeFi, Yield, Yearn-v2, Ethereum, Deposits                         │
├───────────────┼───────────────────────────────────────────────────────────────────┤
│ Documentation │ https://datasets.gizatech.xyz/hub/yearn/individual-vault-deposits │
└───────────────┴───────────────────────────────────────────────────────────────────┘

yearn-individual-deposits looks great!


  3. Load a dataset using DatasetsLoader

loader = DatasetsLoader()

With the DatasetsLoader() instantiated, all we need to do is load the dataset by the name we found with DatasetsHub().

df = loader.load('yearn-individual-deposits')

df.head()

shape: (5, 7)

evt_block_time       evt_block_number  vaults            token_contract_address  token_symbol  token_decimals  value
datetime[ns]         i64               str               str                     str           i64             f64
2023-06-07 09:50:35  17427717          "0x3b27f92c0e21…  "0xdac17f958d2e…        "USDT"        6               14174.301085
2022-08-25 13:53:28  15409462          "0x3b27f92c0e21…  "0xdac17f958d2e…        "USDT"        6               38.046614
2022-08-25 07:13:02  15407745          "0x3b27f92c0e21…  "0xdac17f958d2e…        "USDT"        6               4620.369198
2022-11-19 03:41:35  16001443          "0x3b27f92c0e21…  "0xdac17f958d2e…        "USDT"        6               969.687071
2022-12-30 18:34:11  16299403          "0x3b27f92c0e21…  "0xdac17f958d2e…        "USDT"        6               56.270566

Keep in mind that giza-datasets uses Polars (and not Pandas) as the underlying DataFrame library.


Perfect, the dataset is loaded correctly and ready to go! Now we can pick our preferred ML framework and start building.
