
XFRX - Are newer versions any faster...

Status
Not open for further replies.

GriffMG

I've been using XFRX for a very long time (since 2007, I think) and am currently using (I think) version 191.9

I have a project where I need to produce a very large number of reports (call it 2 million) in PDF format and
it looks like it will take about 10 days to complete... which is a tiny bit too long.

As part of my efforts to improve this, I am wondering if anyone has tried any of the more recent versions
before I invest in updating mine.

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.

There is no place like G28 X0 Y0 Z0
 
What an interesting question.

As I'm sure you know, there are many approaches to producing PDFs, and a variety of tools available. Have you considered running some timing tests with various PDF tools, using a small subset of the data?

Or using different methods of generating the reports (such as Crystal Reports or Report Smith)? (I'm not advocating either of those, merely suggesting some possibilities.)

It will be interesting to know if you come up with a solution.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Hi Mike

I've been playing with optimising the process - it's not just the reports, there is some file compression and spreadsheet
creation involved too - and I need to get the time down to no more than 1.5 days... the whole thing must be completed within
a weekend to shut down one system and start up a new one.

I will look at other products, but I am SOOOO comfortable with VFP native report and XFRX generating pdfs that I am not putting
that first in my schedule.

Regards

Griff
 
Griff, does the spreadsheet creation involve going out to Excel via Automation? If so, that would be a serious bottleneck. In general, calls to Automation servers like Excel can be very slow.

And is your estimate of ten days based on running the reports on a single computer? If so, running them on seven computers in parallel would presumably reduce it to about one and a half days. (But I imagine there could be logistical difficulties with that.)

Mike

 
All options are on the table, and yes, the Excel bit is certainly via automation and I think I have that running
as fast as I can make it - loading Excel once and clearing and filling a sheet over and over. A straight export would be nice
but I think some of the column headings are not good for field names.

Using more than one machine might help, but then you need a shared drive which is slower or split the data
onto separate machines which takes time...

Regards

Griff
 
Hi Griff,

I have tried all modern versions of XFRX except the one that came out last week.
I haven’t seen any improvement in speed. The last few versions have a weird bug with XFRX’s export to XLSX, giving corrupted output, but XFRX PDF output is still OK.

I can only advise using fast hardware, lots of memory, SSDs, and keeping the data in cursors instead of tables on disk.

Regards, Gerrit
 
Well, one thing you can also do with VFP is make use of multiple CPU cores by running an EXE, or VFP9 itself, several times. I doubt you have a single-core CPU.

The only thing that keeps this from paying off fully is what the sessions share: the data and the drive the PDF files are written to. That is the bottleneck. If most of the time is spent gathering the data for a report, then this could speed things up, of course.

You have no good way of determining which CPU core your EXE runs on, but here's what worked for me with two VFP sessions, manually with Task Manager: in the Details tab, look for your vfp9.exe, right-click it, choose "Set affinity" from the context menu, and pick one CPU. If you have cores 0-3, for example, run vfp9.exe four times and assign each process to a different core. That should spread the VFP IDEs across all cores.

I haven't experimented very much with it, but here's a hint that it really works well:
I started two VFP IDEs, kept both vfp9.exe processes at "<All Processors>", and ran the "deadly" loop that does not allow events to happen, just DO WHILE .T. / ENDDO. The CPU usage of the first vfp9.exe goes to 25%, so one full core is doing nothing but this loop. VFP does not make use of more than one CPU core, even if this setting allows it. The second IDE's vfp9.exe also goes to 25% CPU usage when I start the deadly loop there. The system is still responsive, as there are two more cores that can do everything else. But if I now set the affinity of both vfp9.exe processes to CPU 0, the two processes are forced to share one core, and each goes to 11-13% CPU usage, 25% in total.

Which means: the CPU affinity assignment works, and you can manually run as many VFP IDE sessions as you have cores and let them run in parallel by assigning one core to each vfp9.exe. The parallelism cannot be complete, as all sessions access the same data, though that is not much of a problem with an SSD.
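If the Task Manager clicks get tedious, the Windows `start` command accepts an `/affinity` switch with a hexadecimal CPU mask, so the launch lines could be generated instead. A rough sketch in Python, for illustration only: it merely builds the command strings and runs nothing, and the EXE name passed in stands in for whatever program or path you would actually launch.

```python
def affinity_mask(core):
    """Return the hex mask (no 0x prefix) that selects one zero-based core."""
    return format(1 << core, "X")


def start_commands(exe, cores):
    """Build one cmd.exe 'start' line per core; nothing is executed here."""
    # The empty "" is the window title argument that 'start' expects
    # before a quoted program name.
    return [f'start "" /affinity {affinity_mask(core)} "{exe}"' for core in cores]
```

For four cores, `start_commands("vfp9.exe", range(4))` yields lines with the masks 1, 2, 4 and 8. Whether pinning actually beats leaving the scheduling to Windows is something to measure, not assume.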

Now you only need to find a way to split the work into as many chunks as you have CPU cores, for example by iterating the IDs in N sections. Then see how close you get to N times as many PDFs per minute with N CPU cores.
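A minimal sketch of such a split, assuming the reports are keyed by a contiguous integer ID (an assumption; nothing in the thread says how the data is keyed). It is written in Python purely to show the arithmetic, and `split_id_range` is a made-up name; the same few lines translate directly to VFP.

```python
def split_id_range(first_id, last_id, sessions):
    """Split the inclusive range first_id..last_id into contiguous chunks.

    Returns a list of (start, end) tuples, one per session. Chunk sizes
    differ by at most one; empty chunks are dropped when there are fewer
    IDs than sessions.
    """
    total = last_id - first_id + 1
    base, extra = divmod(total, sessions)
    chunks = []
    start = first_id
    for index in range(sessions):
        # The first 'extra' chunks take one additional ID each.
        size = base + (1 if index < extra else 0)
        if size == 0:
            continue
        chunks.append((start, start + size - 1))
        start += size
    return chunks
```

Each session would then be started with its own (start, end) pair and process only those IDs. If some IDs produce much larger reports than others, equal-sized ranges will not finish at the same time, so a shared work queue might balance better.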

Chriss
 
By the way, since XFRX is extra, maybe only run N-1 VFP sessions and give one core to XFRX. So experiment to find the number of VFP sessions that gives the best throughput. It could be a few below the number of cores, especially if your CPU has more cores, like 16 or 32. I have no idea what's available to you.

Chriss
 
On the Excel part, use COPY TO ... TYPE CSV and then open the file and adjust what you need to. That's likely to be way faster than filling cells one by one, even if you have the sheet set exactly as you want it.
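On the earlier worry that some column headings are not good for field names: once the data is in a CSV file, the header row is just text and can be swapped afterwards. A small sketch of that idea, in Python for illustration only; the function name and the sample headings are hypothetical, not taken from Griff's data.

```python
import csv
import io


def replace_header(csv_text, headings):
    """Return csv_text with its first row replaced by the given headings."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    if len(headings) != len(rows[0]):
        raise ValueError("heading count does not match column count")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(headings)      # display headings instead of field names
    writer.writerows(rows[1:])     # data rows are passed through unchanged
    return out.getvalue()
```

This reads the whole file into memory, which is fine for one sheet's worth of rows but would want a streaming version for anything large.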

Tamar
 
Tamar,

I agree, but maybe he already uses xls.range=@vfparray or creates the XLS with XFRX. Isn't that also one of its main uses, just as FRX2Any can do it, too?

I wonder why you don't go directly to PDF, if that's the goal, but those are details.
It's surely a valid question for you, Griff, because "loading excel once and clearing and filling a sheet over and over" is not necessarily the fastest you can go. If filling means setting cell values one by one, that is indeed slow.

I would also look into the bottleneck of everything. What is actually taking what part of the time to produce a PDF? Is it putting the data together? Is it export to XLS(X) or is it generating the PDF in the end? You can easily find out by logging the milestones of the process. You also talked about compressing, i.e. zipping something?
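Logging the milestones can be as simple as accumulating elapsed time per stage and comparing the totals at the end. A minimal sketch of the idea, in Python for illustration (in VFP, SECONDS() around each step would play the same role); the class and stage names are made up.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock seconds per named stage."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        started = self.clock()
        try:
            yield
        finally:
            # Add the elapsed time even if the stage raised an error.
            self.totals[name] += self.clock() - started

    def shares(self):
        """Return each stage's fraction of the total measured time."""
        grand_total = sum(self.totals.values())
        if grand_total == 0:
            return {}
        return {name: secs / grand_total for name, secs in self.totals.items()}
```

Wrapping each step in `with timer.stage("data"):`, `with timer.stage("excel"):`, `with timer.stage("pdf"):` and `with timer.stage("zip"):` over a test batch shows which step deserves the optimization effort first.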

Anyway, a weekend is 48 hours or 172,800 seconds, and with literally millions of reports you would only have a fraction of a second for each report.
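To put numbers on that, here is a back-of-the-envelope calculation using the figures from this thread: about 2 million reports, a 10-day estimate and a 1.5-day target. It is in Python for illustration only, and the perfect-scaling assumption in `sessions_needed` is optimistic, since shared data and disks will eat some of the gain.

```python
import math

REPORTS = 2_000_000  # "call it 2 million", from the original question


def seconds_per_report(days, reports=REPORTS):
    """Average time available per report if the whole run takes 'days'."""
    return days * 24 * 3600 / reports


def sessions_needed(current_days, target_days):
    """Parallel sessions required, assuming throughput scales perfectly."""
    return math.ceil(current_days / target_days)
```

`seconds_per_report(2)` comes to 0.0864 seconds for a full 48-hour weekend, against 0.432 seconds per report at the current 10-day pace. `sessions_needed(10, 1.5)` gives 7, which matches Mike's seven computers.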

Chriss
 
Wow

Thank you all for your thoughts.

It's not settled yet, but it is somewhat of the 'sunset' of a situation.



Regards

Griff
 
Gerrit Broekhuis said:
I haven’t seen any improvement in speed. The last few versions have a weird bug with XFRX’s export to XLSX, giving corrupted output, but XFRX PDF output is still OK.
Hmm, you have problems with export to MS Excel, but I don't see an e-mail at support@eqeus.com. Do you have the problem with XLS/XLSPLAIN or NATIVE_FXLSX/NATIVE_PFXLSX?

GriffMG,
The latest version is 22.3.
Version 22.0 is about 40%-60% faster, but it depends on many conditions.
XFRX#INIT with direct output to PDF is faster than using XFRX#LISTENER and transformation to PDF.

XFRX said:
XFRX#LISTENER always uses transformation, because it uses an XFF cursor for storing the report output.

XFRX 22.4 is faster than XFRX 22.3.
XFRX#LISTENER is about 40% faster on new hardware, but about 18% on old hardware.
XFRX#INIT is about 10% faster on new hardware, but about 1% on old hardware.

I didn't test a one-page report, but a report of about 180 pages with export to PDF.

XFRX has a very fast and simple export of a cursor/table to XLSX:






mJindrova
 
mJindrova said:
XFRX has very fast and simple export cursor/table to XLSX

Thanks for pointing that out. Griff, I don't know why you even automate Excel, unless you need more than just a table exported to a sheet, such as formulas or other layouts. The basis will normally be a table you export to xls(x), and if you have the tool for that, I wonder what exactly you still need Excel automation for. Are you embedding an Excel sheet into an FRX as an OLE object?

Chriss
 
Martina,

We have been in contact recently quite a few times regarding the XLSX issues I had. I sent you sample data, reports and output files. Unfortunately the problem was never solved and in fact became even worse with newer versions and new options. This all in spite of your highly appreciated efforts to assist.

I will test again with the latest XFRX version when I have time for this. I will let you know, of course. Who knows, it might work again as it did a few versions ago.

Now let’s get back to topic please!

Regards, Gerrit
 
For clarity, in case anyone is reading this with no background, mJindrova is Martina.

Martina is the face of support for XFRX, and I have had positive experiences working with her
in the past - dealing with repeated images last time IIRC - even though my support expired
many years ago.

Thank you Martina for the update.



Regards

Griff
 
Gerrit said:
Now let’s get back to topic please!
Weren't we?

The "think" alone in
Griff said:
I think I have that running as fast as I can make it
invites advice on how to know that it's the fastest.

To me it also raises the question of why you don't simply measure where in the process from data to PDF your bottleneck is. Even if you're satisfied with running a single report, you're obviously not when running it in bulk. I see three main stages: a) retrieving the data, b) processing it into a report, and c) processing that report into PDF output. There is obviously also a step that generates an XLS sheet, for whatever reason. Maybe it is the report itself or is embedded in the report; I don't know anything about the report details. But it should be easy to measure where most of the time goes and optimize those steps. Sometimes you only believe you're already fast because you don't know about other options. Sometimes the options are even known but rejected, like EXPORT or COPY TO XLS: even if it doesn't handle the formatting in one step, it's still a fast way to get the data into sheet form. And I'm puzzled that you use automation when generating XLS is part of XFRX. Are you really aware of all your options, and are you making the best of them?

Disregarding all that, your only other option is to run in parallel. I described how you can set the affinity of several VFP sessions to one CPU core each, so each session you start on Friday could run a script designed to work through one portion of the data. Before doing that, of course, experiment and see how much it accelerates things.

But overall, aren't we deep in the topic of optimizing the performance, Gerrit? I wonder why you'd say we aren't.

Chriss
 
I have been running a series of tests with a modest part of the data, trying various ideas to see what improves performance.
Thus far I have reduced the time this subset takes from 150 minutes to about 45 - 50. I can SEE which parts are taking time, even without measuring
them explicitly, and it boils down to the Excel work, the XFRX report and the zipping of the data into packages.

With everyone's suggestions I should be able to improve things further - we'll have to see how it goes.

At the moment, however, I have to hold fire while I await the customer's approval to continue.

Thank you all for your ideas.


Regards

Griff
 
It might be Chriss, but that is processing a fraction of the actual data - scaling it up to the full data suggests it
will still take 9.75 DAYS to complete - which probably would not be good or sufficient...



Regards

Griff
 
Well, sometimes the only other thing you can bring to the table is more parallel processing. And then it might not end at CPU core affinities, but also involve splitting up the data. Is the data settled on the Friday before your weekend, or is it a moving target, so that you have to report on data as it comes in during the weekend, too?

Chriss
 