Hi All,
I know there was some talk about this a few months back, but I can't seem to find it.
I've begun to discover the difference between ChatGPT and OpenAI API.
Anyway, I have an application that has some contact management built into it that I've been using for yonks. It centers around scanning a business card, and populating various data based on that.
I've had a large routine for years that parses the unstructured data on the card (if you've never had to do this, it is surprisingly more complicated than one might expect, especially for such a limited amount of data).
So I got the "bright idea" first to have GPT do the "look at the card" and give me the data. This actually works extremely well. If you take an image (scan) of a card and pass it to ChatGPT, it does a perfect job of parsing the data.
So I thought, great, let's automate that... BUT, I soon discovered that the OpenAI API doesn't accept images for processing.
Bummer.
But then I thought, OK, my OCR engine (TOCR) actually does a fine job of getting the text from the card, so getting the text wasn't really the issue. (ChatGPT does in fact do a better job, especially on cards with colors that are hard to contrast, but my OCR works 98%-ish of the time.) So, why not let it do that stage, and then pass THAT to the OpenAI API call?
Lots of fumbling around with getting an API key generated, and then asking ChatGPT how to integrate this with VFP9 for what I wanted, led me to this point.
Oh, and I should mention, my "prompt" asks OpenAI to parse data passed to it from a business card into specific field titles. (Then I use these to populate the fields with EASE, and I don't need a complex ID routine any longer). Sorry for the long intro explanation to the problem but here is where we are:
ChatGPT created this function for me:
Code:
FUNCTION SendToOpenAI
    LPARAMETERS cPrompt, cAPIKey, cModel
    LOCAL oHttp, cURL, cData, cResponse

    * Create the HTTP object
    oHttp = CREATEOBJECT("MSXML2.ServerXMLHTTP.6.0")

    * Determine the correct API endpoint based on the model
    IF cModel $ "gpt-3.5-turbo gpt-4"
        cURL = "https://api.openai.com/v1/chat/completions"
        * Prepare the JSON data for chat-based models
        cData = '{' + ;
            '"model": "' + cModel + '",' + ;
            '"messages": [{"role": "user", "content": "' + cPrompt + '"}],' + ;
            '"max_tokens": 100,' + ;
            '"temperature": 0.7' + ;
            '}'
    ELSE
        cURL = "https://api.openai.com/v1/completions"
        * Prepare the JSON data for completion models
        cData = '{' + ;
            '"model": "' + cModel + '",' + ;
            '"prompt": "' + cPrompt + '",' + ;
            '"max_tokens": 100,' + ;
            '"temperature": 0.7' + ;
            '}'
    ENDIF

    * Open a connection and set headers
    oHttp.Open("POST", cURL, .F.)
    oHttp.setRequestHeader("Content-Type", "application/json")
    oHttp.setRequestHeader("Authorization", "Bearer " + cAPIKey)

    * Send the request with data
    oHttp.Send(cData)

    * Check if the response is successful
    IF oHttp.Status = 200
        cResponse = oHttp.ResponseText
        RETURN cResponse
    ELSE
        RETURN "Error: " + oHttp.StatusText + " (" + TRANSFORM(oHttp.Status) + ") - " + oHttp.ResponseText
    ENDIF
ENDFUNC
In my SCAN button, I have this code:
Code:
lcInputData = SendToOpenAI(TTPARSEBC.OCRTEXT, cOpenAPIKey, "gpt-4o")
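One thing that may be worth checking here: VFP's $ operator tests whether the left-hand string occurs inside the right-hand string, so a model name that is not literally part of "gpt-3.5-turbo gpt-4" falls through to the ELSE branch of the function above and is posted to the completions endpoint rather than the chat one. Below is a small sketch of that behaviour, plus one possible alternative test. The helper name IsChatModel and the "starts with gpt-" rule are only assumptions for illustration, not something taken from the API documentation.

```foxpro
* "$" asks: is the left string contained in the right string?
? "gpt-4"  $ "gpt-3.5-turbo gpt-4"    && .T.
? "gpt-4o" $ "gpt-3.5-turbo gpt-4"    && .F. ("gpt-4o" does not occur in the list)

* Hypothetical alternative: decide by prefix instead of by substring.
? IsChatModel("gpt-4o")               && .T.

* Hypothetical helper (not part of the original function):
* treats any model whose name starts with "gpt-" as a chat model.
FUNCTION IsChatModel
    LPARAMETERS tcModel
    RETURN LEFT(LOWER(ALLTRIM(tcModel)), 4) == "gpt-"
ENDFUNC
```

A prefix rule like this would misclassify any completions-only model whose name also starts with "gpt-", so an explicit list of the model names actually in use may be the safer choice.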
I wasn't getting what I expected, so I stuck a MESSAGEBOX(lcInputData) call after, and it gives me this:
Now, I note that it's asking about JSON... and what we are doing is passing a fixed prompt (instruction) with variable data. The prompt reads as:
"Use these labels: Company, Honorary, Fname, Mname, Lname, Suffix, Qualifications, Title, Department, URL, Mobile, Tel, Fax, AltPhone, email, Building, Addr1, Addr2, City, State, Country, Post, Remains. Where remains is any text that is left over that doesn't fit into any of the labels. Structure the following data, with the labels mentioned and ensure a cr lf is at the end of each. If the item is blank, don't include it."
I then append the OCR text from the card scan to that prompt, so it might end up looking something like this (text from a card I just scanned, but I've scrambled the data so it's not real):
Gaëtan Pennylover, RCDD
JAPAN DISTRICT VICE CHAIR
gpennylover@boxyi.jp
Sagami Bldg 2nd floor, 2-15-6 Ginza
Chuo-ku,
Tokyo 104-0061 Japan
+81.2.5432.8888 / www.boxyi.jp
So this is just sent as a big text string. Now I see something about "JSON", but I've no idea how to pass this as JSON text.
I know this isn't specifically a VFP question, but I suspect others may want to use VFP to do similar work with OpenAI, and since I've seen some talk about JSON here before, perhaps there will be someone who can help me with this?
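For anyone trying the same thing, here is a minimal sketch of one way to make the prompt safe to embed in the hand-built JSON string. The function above already sends a JSON body. What it does not do is escape the prompt, and OCR text from a card contains line breaks and may contain double quotes or backslashes, all of which have to be escaped inside a JSON string value. The name JsonEscape and the sample variable are made up for illustration. The STRCONV(..., 9) step assumes the OCR text is in the current ANSI code page and should go out as UTF-8 (relevant for names like Gaëtan).

```foxpro
* Quick check with a made-up sample containing quotes and a CR/LF pair.
LOCAL lcSample
lcSample = 'Sagami "Bldg"' + CHR(13) + CHR(10) + "Tokyo"
? JsonEscape(lcSample)    && Sagami \"Bldg\"\r\nTokyo

* Hypothetical helper: escape a string for use inside a JSON string value.
FUNCTION JsonEscape
    LPARAMETERS tcText
    LOCAL lcOut, lnI
    * Escape backslashes first so the escapes added below are not doubled.
    lcOut = STRTRAN(tcText, "\", "\\")
    lcOut = STRTRAN(lcOut, '"', '\"')
    lcOut = STRTRAN(lcOut, CHR(13), "\r")
    lcOut = STRTRAN(lcOut, CHR(10), "\n")
    lcOut = STRTRAN(lcOut, CHR(9), "\t")
    * Replace any remaining control characters with a space.
    FOR lnI = 0 TO 31
        lcOut = STRTRAN(lcOut, CHR(lnI), " ")
    ENDFOR
    * Assumption: the text is ANSI and the API expects UTF-8.
    RETURN STRCONV(lcOut, 9)
ENDFUNC
```

Inside SendToOpenAI, the cData lines would then use JsonEscape(cPrompt) in place of cPrompt. The fixed instruction and the OCR text can still be joined into one string before the call, since the escaping takes care of the CR/LF between them.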