Best way for non-alphanumeric

Status
Not open for further replies.

foxmuldr3

Programmer
Jul 19, 2012
166
US
What's the fastest way to take an input string of unknown length and determine whether it consists ONLY of alphanumeric characters?

Code:
lcX  = ALLTRIM(table.field)  && non-binary memo field varying from 0 to ~6,000 bytes
llAN = .t.                   && flag for "is alphanumeric?"
FOR lnI = 1 to LEN(lcX)
    lcC = SUBSTR(lcX,lnI,1)
    IF NOT ISALPHA(lcC) AND NOT ISDIGIT(lcC)
        * It is neither a letter nor a digit, i.e. something other than A-Z, a-z, 0-9
        llAN = .f.
        EXIT
    ENDIF
NEXT
* llAN holds the AlphaNumeric status

Better methods?

Best regards,
Rick C. Hodgin
 
Simple reasoning, if you followed, what I said about copying the string to a c 0-terminated string: lcX: One way copying, only need to copy for c to handle...

I would've thought VFP used NULL-terminated strings internally. And even if it didn't, we're warned in Microsoft's by-ref documentation not to change the length of a character variable. That warning should be sufficient to allow us to manipulate only the data that's there, passing the length as a second parameter alongside any string passed by reference.

As such, I thought "by ref" meant it was using VFP's own internal memory address and not doing the copy, but rather any updates would've hit the original source. I had thought passing by-value made a copy, which was ultimately discarded.

Apparently in both cases VFP makes a copy, and as you say in the by-ref case it then copies the copy back atop the original.

Such a mechanism seems completely unnecessary, and is obviously harming performance. I can't wait to see VFrP's memory performance alongside VFP's -- especially when coordinating with external DLLs.

Best regards,
Rick C. Hodgin
 
Hi again, Rick.

Sorry, I was ill for a few days. I'm feeling better now.

>I would've thought VFP used NULL-terminated strings internally.

It doesn't, but when dealing with a DLL via DECLARE, VFP needs to convert its internal strings for the outside world. The explanation is long-winded, but now I feel well enough and focused enough to give it:

As you know FP 2.6 and C, I assume you also know the Library Construction Kit for creating FLLs. This is still possible, and the main ingredients are in the Home()+"Samples\API" folder: mainly winapims.lib and some .h files, plus of course a C/C++ compiler. The help has lots of samples around the helper functions of this lib.

A starting point is here:
http://msdn.microsoft.com/en-us/library/sac2369k%28v=vs.80%29.aspx

And this hints at the internal structure of variables:
http://msdn.microsoft.com/en-us/library/e9261kx6%28v=vs.80%29.aspx

This describes that the only parameter passed into any FLL function is ParamBlk *param, a pointer to a parameter block structure. It is defined as:

Code:
typedef struct {
   short int pCount;      /* number of parameters passed */
   Parameter p[1];         /* pCount parameters */
} ParamBlk;

The Parameter in that struct is a type of its own, a union:

Code:
typedef union {
   Value val;
   Locator loc;
} Parameter;

That can also be thought of as the structure of a VFP internal variable. It is never explained whether this really is the internal storage, but since this is the struct we get passed, it very likely is.

There's more to learn, of course: Value and Locator are again structs:

Code:
typedef struct {
   char      ev_type;
   char      ev_padding;
   short     ev_width;
   unsigned  ev_length;
   long      ev_long;
   double    ev_real;
   CCY       ev_currency;
   MHANDLE   ev_handle;
   ULONG     ev_object;
} Value;

typedef struct {
  char  l_type;
  short l_where, /* Database number or -1 for memory */
  l_NTI,      /* Variable name table offset*/
  l_offset,  /* Index into database*/
  l_subs,  /* # subscripts specified 0 <= x <= 2 */
  l_sub1, l_sub2; /* subscript integral values */
} Locator;

This raises more questions than it answers, but for the moment let's concentrate on the Value struct and which parts of it are used for string parameters or variables:

1. Value.ev_type is 'C' for strings, just as Type()/Vartype() return.
2. Value.ev_length is the string length, as LEN() returns it. LEN() really just needs to return that member of the variable's Value struct; it does not need to count bytes until a null terminator. Because of that, no null terminator is needed.
3. The string itself has no room inside the Value (or Locator) struct. But there is the MHANDLE Value.ev_handle, which stands for a memory address. The API function _HandToPtr() in winapims.lib translates this handle into a memory address.

And that is where we arrive at what you know as char* in C.
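To make the three points above concrete, here is a minimal sketch in plain C. SketchString, sketch_len and sketch_byte_at are hypothetical names made up for illustration; this is a simplified stand-in and not the real Value struct from the Library Construction Kit headers. It only shows the idea that a string carrying its own length needs no terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a string variable.
   This is NOT the real Value struct, only an illustration. */
typedef struct {
    char        type;    /* 'C' for a character string */
    unsigned    length;  /* byte count, what LEN() would report */
    const char *data;    /* raw bytes, no terminating '\0' required */
} SketchString;

/* Constant-time length: read the stored field, never scan for '\0'. */
static unsigned sketch_len(const SketchString *s)
{
    assert(s != NULL && s->type == 'C');
    return s->length;
}

/* Byte access is bounded by the stored length, so a zero byte inside
   the data is ordinary content rather than an end marker. */
static int sketch_byte_at(const SketchString *s, unsigned index)
{
    assert(s != NULL && index < s->length);
    return (unsigned char)s->data[index];
}
```

With this layout, asking for the length costs the same for 6 bytes as for 6,000, which is the point of item 2.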

>As such, I thought "by ref" meant it was using VFP's own internal memory address and not doing the copy
Unfortunately not. Even though I just showed there is an address that could be forwarded as is to a DLL, it isn't.

VFP only allocates as many bytes for a variable as are needed for the structs. If further memory is needed, again only as much as required is allocated and pointed to by the MHANDLE. That's simply a char array of Value.ev_length bytes, with no byte for a null terminator. Indeed, the Fox Team could have decided to always add that one byte, but obviously they didn't. It was a different time way back when this was all invented.

As a C function expects a char* to a null-terminated string, VFP has to add the missing null terminator: it allocates at least 1 byte more than it already has for its internal storage, copies the string bytes to that new address, and forwards that as char* to the DLL function. The DLL function can then use strcpy or other typical C string functions that work on null-terminated strings.
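The copy step described above can be sketched roughly like this. It is only an illustration of the idea, not VFP's actual code; to_c_string and copy_back are hypothetical helper names, and the copy-back half assumes the by-ref behaviour described earlier in this thread, with the length left unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy `len` raw bytes into a fresh buffer that has one extra byte
   for the '\0' terminator. Returns NULL if allocation fails.
   The caller owns the result and must free() it. */
static char *to_c_string(const char *raw, size_t len)
{
    assert(raw != NULL || len == 0);
    char *copy = (char *)malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(copy, raw, len);
    }
    copy[len] = '\0';
    return copy;
}

/* By-reference case (assumed): after the external call returns,
   the same number of bytes is copied back over the original. */
static void copy_back(char *raw, const char *copy, size_t len)
{
    assert(raw != NULL && copy != NULL);
    memcpy(raw, copy, len);
}
```

Both helpers touch every byte, so the cost of this translation grows with the string length, on top of the allocation itself.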

That already explains why there is a need to copy. It really works that way, and you can verify it by logging API calls via Process Monitor by Sysinternals. I once had a session with Christof Wollenhaupt, who showed me that you really can see MALLOC being called between the Fox call of the C DLL function and the DLL function itself running.

Furthermore, that also shows there is intermediate code running between the call of the C function in FoxPro code and the C function itself. When you use DECLARE to declare an external DLL function, that doesn't just add an address to a jump table. It creates a stub function that handles the Fox parameters and translates them to C-style parameters.

With an FLL that is a bit different, and this is where the more thorough explanation hopefully helps beyond answering your original (indirect) question about null termination. The FLL call mechanism also has overhead, but it does not convert VFP variable structs; these are passed in as a ParamBlk. The downside is that quite a few bytes have to be prepared for that ParamBlk, and you also need to call _HandToPtr to get at the string address, but that overhead pays off for longer strings.

So the good news is that there is a way to make the check even faster than the CHARTRAN solution, at least for longer strings: the overhead of calling into an FLL function is lower and does not rise linearly with string length.
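Just to give an idea of the kind of loop such a function could run once it has the string address and length, here is a rough sketch in plain C. It is not the FLL itself and uses none of the Library Construction Kit API; is_alnum_ascii is a hypothetical name. It also assumes only ASCII A-Z, a-z and 0-9 count as alphanumeric, which may differ from what ISALPHA() accepts under a given code page.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every one of the `len` bytes is an ASCII letter or
   digit, otherwise 0. An empty buffer returns 1, which matches the
   original loop where llAN stays .T. for an empty string.
   The length is passed explicitly, so no '\0' terminator is needed. */
static int is_alnum_ascii(const char *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)data[i];
        int ok = (c >= '0' && c <= '9') ||
                 (c >= 'A' && c <= 'Z') ||
                 (c >= 'a' && c <= 'z');
        if (!ok) {
            return 0;  /* stop at the first offending byte */
        }
    }
    return 1;
}
```

Like the EXIT in the original loop, it stops at the first byte that is not a letter or digit.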

I'll prepare a VS2010 solution; I'm sure you don't mind. But you'll have to wait a bit. It's getting late and I still need to recover a little.

Bye, Olaf.
 
I understand why VFP is doing it, I just don't think the way they're passing the parameter is a good idea. They should have introduced a third option: (1) by-val, (2) by-ref, and (3) native, where "native" is a form which points to the literal location in memory where the data for that item exists. That way direct updates become possible without the overhead.

My opinion.

I hope you feel better, Olaf. I have been sick the past couple of weeks as well. Feeling better though. Slow recovery this time (age 43).

Best regards,
Rick C. Hodgin
 