Looking for Guidance on XRay Optimisation for Comprehensive Data Analysis

Hello Everyone :hugs:,

I recently started using XRay to streamline a large data analysis project I’m working on. Although I’m impressed by XRay’s capabilities, I’m having some difficulty scaling the analysis, and I would be very grateful for any guidance or thoughts from this experienced community.

Here are some project specifics:

Data Size: I’m working with a dataset of about 10 million rows that contains both structured and unstructured information. The unstructured part consists of text fields that need to be processed using natural language processing (NLP) methods.

Analysis Goals: The main objective of the analysis is to find patterns and correlations in the data that can inform decisions. Later on, I hope to incorporate some predictive modelling as well.

Present Difficulties:

  1. Performance: XRay’s performance begins to deteriorate as the dataset gets larger. I’m looking for better ways to manage resources or optimise queries so that performance stays acceptable at this scale.

  2. Integration of NLP: It has been a little challenging to integrate NLP steps into XRay, particularly preprocessing and analysis. Are there any tools or best practices that complement XRay well for NLP tasks? :thinking:

  3. Visualisation: The built-in visualisations are already very good, but I need to customise them further to match specific reporting needs. Are there any advanced methods or plugins that can help with this? :thinking:

If anyone has run into similar difficulties or has suggestions on how to make better use of XRay for projects of this kind, I would be delighted to hear from you. Any advice, resources, or firsthand Mendix knowledge would be really appreciated!

Thank you :pray: in advance for your help and support.