Zapped: AcousticBrainz / AB-125

BigQuery summary data schema


    • Type: New Feature
    • Resolution: Won't Do
    • Priority: Normal

      The easiest way to play with the data was to import it from Postgres as strings containing the JSON documents. This is quick, but it means every query has to parse the whole document to pull out a single item, and you use up your quota much more quickly. So the first thing we should do before transferring data to BigQuery is develop a schema that represents our low-level documents; then we can query just a specific field and the data usage should stay small.
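
      As a rough illustration (the dataset, table, and column names here, such as acousticbrainz.lowlevel_raw, data, and rhythm.bpm, are placeholders rather than the final schema), the per-query difference looks like this:

          # String import: every query parses the full JSON document
          bq query --use_legacy_sql=false \
              'SELECT JSON_EXTRACT_SCALAR(data, "$.rhythm.bpm")
               FROM acousticbrainz.lowlevel_raw'

          # Structured schema: only the rhythm.bpm column is scanned and billed
          bq query --use_legacy_sql=false \
              'SELECT rhythm.bpm FROM acousticbrainz.lowlevel'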

      The steps should probably involve (see the sketch after the list):
      1. Create empty BQ table with "bq mk"
      2. Update the schema with "bq update -t <table> <schema>"
      3. Load summarized data already in Google Cloud Storage to BigQuery
      4. Test against a couple of queries
      5. Repeat until queries are optimized
      6. Proceed to #126
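
      A minimal sketch of steps 1-4, assuming a dataset named acousticbrainz, a table named lowlevel, and a schema that covers only a sliver of the real low-level document:

          # schema.json: a nested (RECORD) schema mirroring the JSON structure
          [
            {"name": "mbid",   "type": "STRING", "mode": "REQUIRED"},
            {"name": "rhythm", "type": "RECORD", "mode": "NULLABLE", "fields": [
              {"name": "bpm", "type": "FLOAT", "mode": "NULLABLE"}
            ]}
          ]

          # 1. Create the empty table
          bq mk -t acousticbrainz.lowlevel

          # 2. Apply (and later iterate on) the schema
          bq update -t acousticbrainz.lowlevel schema.json

          # 3. Load the summarized JSON already in Google Cloud Storage
          bq load --source_format=NEWLINE_DELIMITED_JSON \
              acousticbrainz.lowlevel 'gs://<bucket>/summary/*.json'

          # 4. Test a query and check the bytes processed
          bq query --use_legacy_sql=false \
              'SELECT AVG(rhythm.bpm) FROM acousticbrainz.lowlevel'

      Iterating on the schema (step 5) would mean adjusting schema.json and re-running the update and test queries until the bytes billed per query are acceptably small.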

            Assignee: Unassigned
            Reporter: Goran Cetusic (gcetusic)

