2023 and Ruby is still ridiculously slow for some things

I am doing some machine learning work for recommendations and needed to preprocess some training data before feeding it into the recommender. In Ruby, processing a dataset of half a million items was taking up to 20 seconds despite a ton of optimizations. I ported the same processing to a Postgres function (it's not just queries: I need to build a temporary dataset and then derive another from it with processing that requires looping; I could in theory do it with CTEs, but that would be ridiculously complex in comparison). This way it happens directly in the database, and it now takes just 50 milliseconds for the exact same processing. That is about 400 times faster than the same thing done in Ruby. I still cannot believe the difference. I was planning to port the Ruby code to Crystal for the speed, but I am going to leave it in Postgres. It's 2023 and for some things Ruby is still ridiculously slow compared to the alternatives 😦
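
For anyone who wants to check the ratio, here is a minimal Ruby sketch of the arithmetic. The only inputs are the figures quoted in the post (up to 20 seconds, 50 milliseconds, half a million items); the variable names are made up for illustration and nothing here is taken from the actual preprocessing code.

```ruby
# Figures quoted in the post (the Ruby timing is the reported upper bound).
ruby_seconds     = 20.0     # Ruby preprocessing, up to 20 s
postgres_seconds = 0.050    # Postgres function, 50 ms
items            = 500_000  # half a million items

# How many times faster the Postgres version is.
speedup = ruby_seconds / postgres_seconds

# Average cost per item for each version.
ruby_us_per_item     = ruby_seconds / items * 1_000_000          # microseconds
postgres_ns_per_item = postgres_seconds / items * 1_000_000_000  # nanoseconds

puts "speedup:  #{speedup.round}x"
puts "Ruby:     #{ruby_us_per_item.round} µs per item"
puts "Postgres: #{postgres_ns_per_item.round} ns per item"
```

With those inputs the speedup is 400x, which works out to roughly 40 microseconds per item in Ruby against roughly 100 nanoseconds per item in Postgres.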

Lead Platform Architect at the day job, Ethical Hacker/Bug Bounty Hunter on the side

Comments

  • edited March 2023

    What is Ruby used for? Maybe it's not the right tool for your use case!

  • havoc (OG, Content Writer)

    Isn't everyone and their dog moving to polars for this?

  • @havoc said:
    Isn't everyone and their dog moving to polars for this?

    Polars or Polaris? Not familiar with it and Google doesn’t help

  • havoc (OG, Content Writer)

    @vitobotta said:
    Polars or Polaris? Not familiar with it and Google doesn’t help

    Rust implementation of dataframes that seems to be taking over from Pandas slowly but surely

    https://github.com/pola-rs/polars

  • @havoc said:
    Rust implementation of dataframes that seems to be taking over from Pandas slowly but surely

    https://github.com/pola-rs/polars

    Gotcha, thanks

  • BTW I am super excited because Google is organizing a workshop on machine learning exclusively for our company in their Helsinki office. I can't wait!
