Working with Combined Knowledge Sorts inside a Discipline Utilizing Rockset

0/5 No votes

Report this app

Description

[ad_1]

So… you assume all of your information in a specific subject are a string sort, however whenever you attempt to run your question, you get some errors. Doing extra investigation, it appears to be like like you’ve gotten some int and undefined varieties as effectively. Bummer…


panda crying

Despair not! We will truly work round this (with out information prep 😉). To recap, in our first weblog, we created an integration with MongoDB on Rockset, so Rockset can learn and [update] the information coming in MongoDB. As soon as the information is in Rockset, we will run SQL on schemaless and unstructured information.

The info:

Embedded content material: https://gist.github.com/nfarah86/ef1cc9da88e56226c4c46fd0e3c8e16e

We have an interest within the release_date subject: "release_date": "1991-06-07".

The question:

Rockset has a perform referred to as DATE_PARSE(), which lets you flip a string formatted date right into a date object. If you wish to order films by simply the 12 months, you should utilize EXTRACT().

Primarily, when you flip your string formatted date right into a date object, you may then extract the 12 months.

At first look, this appears fairly simple to resolve— should you wished to order all of the film titles by the discharge 12 months, you may write one thing like this:

SELECT
      t.title, t.release_date
FROM
    commons.TwtichMovies t
ORDER BY
    EXTRACT(
         YEAR
         FROM
         DATE_PARSE(t.release_date, '%Y-%m-%d')
    ) DESC
;

When working this question, we get a timestamp parsing error:
Error [Query]
Timestamp parse error:

This might imply you’re working with different information varieties that aren’t strings. To examine, you may write one thing like this:

SELECT
      t.title, TYPEOF(t.release_date)
FROM
    commons.TwtichMovies t
WHERE 
    TYPEOF(t.release_date) != 'string'
;

That is what we get again:


sql result

Now, that we all know what’s inflicting the error, we will re-write the question to discard something that’s not a string sort— proper 🤗?

SELECT
      t.title, t.release_date
FROM
    commons.TwtichMovies t
WHERE
    TYPEOF(t.release_date) = 'string';
ORDER BY
     EXTRACT(
          YEAR
          FROM
              DATE_PARSE(t.release_date, '%Y-%m-%d')
     )DESC
;

WRONG 🥺! This truly returns a timestamp parsing error as effectively:

Error [Query]
Timestamp parse error

You are most likely saying to your self, “what the heck.” One case we didn’t think about earlier is that there may very well be empty strings 🤯- If we run the next question:

SELECT DATE_PARSE('', '%Y-%m-%d');

We get the identical timestamp parsing error again:

Error [Query]
Timestamp parse error

Aha.

How will we truly write this question to keep away from the timestamp parsing errors? Right here, we will truly examine the LENGTH() of the string and filter out every part that doesn’t meet the size requirement— so one thing like this:

WHERE LENGTH(t.release_date) = 10

We will additionally TRY_CAST() t.release_date to a string. If the sector worth can’t be changed into a string, a null worth is returned (i.e. it gained’t error out). Placing this all collectively, we will technically write one thing like this:

SELECT
    t.title,
    t.release_date
FROM
    commons.TwtichMovies t
WHERE
    TRY_CAST(t.release_date AS string) shouldn't be null
    AND LENGTH(TRY_CAST(t.release_date AS string)) = 10
ORDER BY
    EXTRACT(
         YEAR
         FROM
         DATE_PARSE(t.release_date, '%Y-%m-%d')
    )
;

Voila! it really works!


panda happy

Through the stream, I truly wrote a extra sophisticated model of this question. The above question and the question within the stream are equal. We additionally wrote queries that mixture! You may catch the complete breakdown of the session beneath:

Embedded content material: https://youtu.be/PGpEsg7Qw7A

TLDR: yow will discover all of the sources it is advisable to get began on Rockset within the developer nook.



[ad_2]

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.