
HTTP Get - Problem with data caching


Richard Guelzow

Programmer
Jan 6, 2022
Hello All - I am querying an online database (non-VFP) using an HTTP GET call. The 1st query works properly and returns the current status of the online database. Each subsequent call to the database returns the data from the initial call, despite the fact that the underlying data in the online database has changed. The only way that I am able to obtain refreshed data is to exit VFP and restart the application. I have tried releasing the Microsoft.XMLHTTP object and re-creating it. I have tried releasing the local variable that the HTTP object is tied to. So far nothing has helped; I keep receiving the cached data from the initial query. Here is my code as entered into the VFP Command window:

Code:
m.oHTTP = CREATEOBJECT('Microsoft.XMLHTTP')
m.oHTTP.Open("GET", 'http://147.182.139.15:5000/pickup_sms/+18474991289', .F.)
m.oHTTP.Send()
m.cResult = m.oHTTP.responseText
? XMLTOCURSOR(m.cResult,'Smscursor',0)

Thanks in advance.
Richard
 
I think you need to 'fool' your HTTP provider into thinking that cached data will not do, by modifying your request in such a way that it is unique.
A simple number on the end might do.

** edit typo in GET line, & changed to ?
Code:
m.mynumber = time()
m.oHTTP = CREATEOBJECT('Microsoft.XMLHTTP')
m.oHTTP.Open("GET", '[URL unfurl="true"]http://147.182.139.15:5000/pickup_sms/+18474991289?time='+m.mynumber,[/URL] .F.)
m.oHTTP.Send()
m.cResult = oHTTP.responseText()
? XMLTOCURSOR(m.cResult,'Smscursor',0)

Regards

Griff
Keep [Smile]ing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.
 
That's an idea, Griff.

You could also read up on request headers about caching. In your request you can specify that no caching is allowed, which not only affects the local cache but also asks the server not to cache this response. And there are further cache-related headers...

Microsoft.XMLHTTP also has two functions to look at: getAllResponseHeaders and setRequestHeader.
It's a vast topic in itself, worth knowing about.
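
For illustration, a minimal sketch of both calls against the URL from Richard's post. Nothing here is guaranteed to defeat an intermediate cache; it just shows where a cache-related request header is set and how to inspect what comes back:

Code:
m.oHTTP = CREATEOBJECT('Microsoft.XMLHTTP')
m.oHTTP.Open("GET", 'http://147.182.139.15:5000/pickup_sms/+18474991289', .F.)
m.oHTTP.setRequestHeader('Cache-Control', 'no-cache')   && request headers go after Open()
m.oHTTP.Send()
* Dump everything the server (or a cache in between) sent back;
* look for Cache-Control, Expires, Age, ETag and friends.
? m.oHTTP.getAllResponseHeaders()
* Or pick out a single header:
? m.oHTTP.getResponseHeader('Cache-Control')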

Griff's solution is also described there:
A modern best practice for static resources is to include version/hashes in their URLs, while never modifying the resources — but instead, when necessary, updating the resources with newer versions that have new version-numbers/hashes, so that their URLs are different. That’s called the cache-busting pattern.

The sample hints at a version number in the file name within the URL, but as Griff indicates you can also address the exact same resource with another URL that busts caching: an additional unused parameter is one way; another is making use of # at the end of the URL.

Chriss
 
Thanks Griff & Chris for your assistance.
My initial approach this morning was to follow Griff's suggestion using his code. What I found is that when I add anything to the end of my URL I receive back an empty data set. Next, I added a variable 3rd parameter to the GET request to see if that might work. No luck with that either. I continue to receive cached data.

Next, I looked at Chris's approach and played with m.ohttp.setRequestHeader('Cache-control:','no-cache') and other variants. So far nothing has made a difference. Then, to validate the API, I went outside of VFP and used an online tool called Postman to generate HTTP requests and the API performed perfectly, providing an updated result set with each change to the underlying online data with no caching.

My next approach is to try the West Wind wwHTTP object and see if that performs any differently. Thank you for your help and suggestions and let me know if you have any other thoughts.

Richard
 
OK - I tried the West Wind wwHttp control and it works perfectly: it retrieves fresh data after each underlying server data change. The only downside to this control is the addition of new libraries to the application and the licensing fee for the library (which covers more than just this control) [$279].

So... I decided to take another crack at Griff's idea after seeing code using a "?" rather than a "&" before the unique number. This seemed to be the trick for me. Thanks again for your help.

Richard

Code:
m.mynumber = STR(100 * RAND(),6,2)
m.oHTTP = CREATEOBJECT('Microsoft.XMLHTTP')
m.oHTTP.Open("GET", '[URL unfurl="true"]http://147.182.139.15:5000/pickup_sms/+18474991289'[/URL] +'?'+ m.mynumber,.F.)
m.oHTTP.Send()
m.cResult = oHTTP.responseText()
? XMLTOCURSOR(m.cResult,'Smscursor',0)
 
I wondered why that doesn't work.

There are more things than Cache-Control: no-cache. And by the way, you don't pass in the colon, so the request header might work if you set it with
m.ohttp.setRequestHeader('Cache-Control','no-cache')
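
If a single directive doesn't bite, the values can also be combined into one comma-separated header, and the legacy Pragma header is still honored by some older proxies. A sketch only; there is no guarantee any intermediary on this particular route respects either:

Code:
m.oHTTP.setRequestHeader('Cache-Control', 'no-cache, no-store, max-age=0, must-revalidate')
m.oHTTP.setRequestHeader('Pragma', 'no-cache')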

I could also accept that a # wouldn't actually count as part of the URL; it's a fragment that points to an anchor in the response body, and I think it's related to HTML only, not XML or JSON.

I would have bet, though, that & would still work, as the URL format suggests the phone number at the end is rewritten into a request parameter by URL rewriting. A ? would already be created by that, so a ? actually shouldn't work, but an & should. URL rewriting might also discard any unknown URL part and thus cause the cached response, though all of that would still happen on the server, and a cached result usually means you don't get through to the server itself but get the cached response from a proxy server. Even if your ISP doesn't configure one for your internet connection, a server might sit behind several proxies or use a CDN that acts on its behalf. In those cases you need more than just no-cache. There are mechanisms I don't know in detail, based on an identifier you get from the first response and pass back in as a reference to the latest result you already know, demanding revalidation: you either get a newer state or a response status code confirming this is the latest state and you're not missing a newer SMS in your case.

That's also a reason this is more useful than just busting the cache mechanism. If the API is rate limited, for example, smarter requests making use of such a cache identifier mechanism mean you use up fewer requests, as your request is only forwarded to the server, and only counts against your rate limit, when the proxy or CDN knows that cache id is stale.

Or, in very short: while MDN considers cache busting a best practice, caches are still there to be used; they just have to be used intelligently by both the server and the client side to make better use of the internet overall.

Chriss
 
Richard, although you have your problem already solved, instead of using a "Microsoft.XMLHTTP" object, can you try with an "MSXML2.ServerXMLHTTP.6.0" object?

Both are immediately available on a regular Windows system but use different HTTP libraries (WinInet vs. WinHTTP). ServerXMLHTTP is normally better suited to your use case: a business application requiring real-time data from an HTTP server.
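
For illustration, a minimal sketch of that swap, reusing the URL from the earlier posts; only the CREATEOBJECT() line really changes, the rest of the interface is the same:

Code:
m.oHTTP = CREATEOBJECT('MSXML2.ServerXMLHTTP.6.0')
m.oHTTP.Open("GET", 'http://147.182.139.15:5000/pickup_sms/+18474991289', .F.)
m.oHTTP.Send()
IF m.oHTTP.status = 200
    m.cResult = m.oHTTP.responseText
    ? XMLTOCURSOR(m.cResult, 'Smscursor', 0)
ENDIF

Because it rides on WinHTTP rather than WinInet, ServerXMLHTTP does not use the IE/WinInet URL cache, which is the usual suspect when Microsoft.XMLHTTP keeps returning stale data.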
 
A lot of infrastructures, including browsers, internet accelerators, firewalls, content scanners etc., do not properly respect the cache headers - to make themselves look quicker.

I find the unique number approach works pretty well, though even then I sometimes see clients dragging stuff from a cache somewhere rather than from my server! If a request isn't in my server log files, that pretty much proves it.

Regards

Griff
 
Griff, interesting, how do you see that?

Is it your software using a number sequence and in your server logs you see gaps?

Wouldn't it be interesting to know how and why even these intentionally unique URLs don't bust the cache, and how caching could be used more efficiently without the unfortunate side effect that led to cache busting being seen as the best way to get at the latest results? Because to me cache busting is just the idea of discarding any cache.

If someone polls for new SMS every minute because the average number of incoming SMS per hour is more than 60, that's fine, but if you then poll every second, it drains bandwidth for no good reason. It really is bandwidth lost to other requests; it leads to sites being hosted on more nodes than necessary just to respond with fresher data; and in the normal case you even get the same response from the actual server, so you bypass a cache even though it has perfectly good, up-to-date data.

So in essence, wouldn't you like to see fewer requests in your logs, even if that doesn't cost you anything because of flat rates?

Chriss
 
Hi

I use a unique id, typically called 'a', and it consists of the session id, the time and a random number - which is about as unique as it can be.

If a user tells me they are not seeing up-to-date info (say a login that keeps rejecting), I can follow through the logs, using the sessionID part and their IP address, and see that they call the code that asks for a user id and password and perhaps never call the code that validates that pair against a database... so if they clicked submit and got a response - but didn't ask the server to validate - that response must come from somewhere in their infrastructure: either the browser is caching it locally, or their content filter is... or some other proxy between them and me.

Typically, again, this is a 'big company, low IT budget' problem - where they insist that people working from home use a VPN and go through their head office and IT structure, but haven't invested in engineering it very well (often the VPNs were set up for 50 users, and now having 500 on them stretches it all a bit; something has to give).
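
By way of illustration only, such a token could look roughly like this in VFP; m.cSessionId is a placeholder for however the application tracks its session, and SYS(2015) just supplies a unique suffix. (Note that Richard's particular API apparently rejected a named parameter, so this fits the web application scenario rather than his SMS endpoint.)

Code:
* Build a request id from session id, timestamp and a unique suffix,
* append it as an otherwise unused parameter, and log the same value locally.
m.cSessionId = "ABC123"                            && placeholder session id
m.cRequestId = m.cSessionId + "-" + TTOC(DATETIME(), 1) + "-" + SYS(2015)
m.oHTTP = CREATEOBJECT('Microsoft.XMLHTTP')
m.oHTTP.Open("GET", 'http://147.182.139.15:5000/pickup_sms/+18474991289?a=' + m.cRequestId, .F.)
m.oHTTP.Send()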


Regards

Griff
 
A login that keeps rejecting is hardly something I would expect to happen, because that is surely something you would never cache; while you may allow a session id to have a 15-minute expiration, a login request can never be answered by a cached result.

According to the MDN documentation, your server should add Cache-Control response headers that state "private" and "no-store".

Chriss
 
Morning guys,

Maybe this is becoming a case of "be careful what you ask for", but nonetheless, it's all very interesting. Chris, I completely support the idea of managing the cache via a proper server request rather than cache busting, but I was unable to get it to work and I need to move on. Here is a list of the cache directives that I tried (as request headers), all with no luck:

Code:
m.oHTTP.setRequestHeader('Cache-Control','no-cache')
m.oHTTP.setRequestHeader('Cache-Control','must-revalidate')
m.oHTTP.setRequestHeader('Cache-Control','max-age=0')
m.oHTTP.setRequestHeader('Cache-Control','no-store')

Anyway, this is a low volume application, so I will query the server every 15 seconds. Thanks to everyone who helped out with this, it's been an educational experience.

Richard
 
Well, sorry to mislead you, Richard,

but in the discussion with Griff I mainly concentrated on what the server side can do. There are different things that make sense as values of Cache-Control headers in requests versus responses.

I was also learning from this and went a bit deeper. The identifier I talked about is called etag. If you get one in the first response and pass it along in subsequent requests, it helps the proxies, CDN servers, or finally the target server decide whether to respond with new data or not.
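
As a rough sketch of how that handshake could look from the VFP side, assuming the server actually emits an ETag header at all (that depends on the API), and keeping in mind that with Microsoft.XMLHTTP the WinInet cache may still interfere, so this shows only the principle:

Code:
* First request: keep the data and remember the ETag that came with it.
m.oHTTP = CREATEOBJECT('Microsoft.XMLHTTP')
m.oHTTP.Open("GET", 'http://147.182.139.15:5000/pickup_sms/+18474991289', .F.)
m.oHTTP.Send()
m.cEtag   = m.oHTTP.getResponseHeader('ETag')     && may be empty/null if the server sends none
m.cResult = m.oHTTP.responseText

* Later polls: send the ETag back; 304 means "nothing new", 200 means fresh data.
m.oHTTP.Open("GET", 'http://147.182.139.15:5000/pickup_sms/+18474991289', .F.)
IF VARTYPE(m.cEtag) = 'C' AND NOT EMPTY(m.cEtag)
    m.oHTTP.setRequestHeader('If-None-Match', m.cEtag)
ENDIF
m.oHTTP.Send()
IF m.oHTTP.status = 200
    m.cEtag   = m.oHTTP.getResponseHeader('ETag')
    m.cResult = m.oHTTP.responseText
    ? XMLTOCURSOR(m.cResult, 'Smscursor', 0)
ENDIF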

All the details are described in the MDN documentation on HTTP caching.

Chriss
 
Chris,

Not at all, I enjoyed the discussion, and learned a bit on a topic of which I had no knowledge as an old-school desktop developer. In addition, I am curious about a line in one of your above posts.

Chris said:
After the MDN documentation your server should add Cache-Control response headers that state "private" and "no-store".

So... If I were to get that code placed on the server, I could forget about all of this? Interesting. I might see if my online guy can do that for me. This is a private use server space. Thanks again. Cheers!

Richard
 
Richard,

you're thinking in the wrong direction. In Griff's case it was a login request, which shouldn't be cached.
In your case, you're only thinking about how to bust the cache, but in your SMS case you'd ideally only need to "bother" the server if there is a new SMS. The cache should therefore be busted from the server side, not by you.



Chriss
 
The login was just an example; it's almost the first thing people hit in an application, so it is easy to follow, for me, and to understand, as a cogent paradigm, for readers.

I do of course set all the meta headers, but some browsers (M$ IE primarily/historically) and internet accelerators, content filters etc. do not fully follow the rules.

These are the 'basic' metas set on every page...

Code:
<META http-equiv="Content-Type" content="no-cache">
<META http-equiv="Pragma" content="no-cache">
<META http-equiv="Expires" content="-1">
<META http-equiv="Pragma-directive" content="no-cache">
<META http-equiv="cache-directive" content="no-cache">
<META http-equiv="Cache-Control: must-revalidate,max-age=0,no-store,no-cache,no-transform">


Regards

Griff
 
Ok, Griff,

so you're mainly avoiding caching. What you say about "following the rules" is unfortunately true, but I think MDN is wrong to make cache busting a best practice. A best practice should be making the best use of caching. In some cases busting the cache should be doable from both sides, simply because whichever developer, on the server or the client side, is the more knowledgeable one should have a way of dealing with caching problems. Even in these days of node package management (npm) and more, you can't rely on either side having a developer who knows all this meta stuff; people concentrate on the HTML or XML/JSON and on the features working, not on this.

There's no need to change your strategy; it goes very well with cache busting, as there would be nothing cached to bust if all nodes played by the rules. But that's simply a sign of bad industry standards, as can be seen in the many ways you can nowadays set up an expiration duration, or not. Cache directives have become cache recommendations or wishes.

Playing along in the foolish game of not following the rules is sometimes where you end up, but any time you're in control of all sides you can make the best of it.

One simple way of busting a cache from the server side is to give newer versions of the same resource new names, i.e. simply a version number in the file name, if it's a resource like an image. In the case of an API endpoint whose URL should stay the same, I think you have to use the etag; you have no other way of making any use of a cache, because the unchanging resource locator has to respond with fresher SMS once there are some. But as the server is the one that knows when a new SMS arrives, at that event it can switch from answering requests carrying the current etag with status 304 to answering with the new SMS, as it knows the old etag is, well, old.

But this can only happen if client developers know to make use of the etag.

Not pointing fingers; I didn't know about the etag either. I also think MDN should be expert enough to know better, so I have a pinch of doubt about calling their best-practice recommendation bad; I suspect it simply comes from being realistic about how caching is undermined in many ways. But caching is not just an artifact of the times when internet bandwidths were lower. Bandwidth is still a bottleneck. Not for every site and every API, but in the grand scheme you always also burden the overall infrastructure of the internet, even with a small private service used only by a small internal user group: as soon as it's not bounded by the intranet, you cause traffic on the public internet.

Chriss
 
To me, the caching is just another M$ hangover - like oplocks - there to try and make their product seem as good as or better than others'. Now of course, they have largely given up - their browser is 'just' a rebadged Chrome... and better for it!

Regards

Griff
 
Cache control for internet nodes isn't an MS invention, though Microsoft is a W3C member (see the W3C member list and filter for M). You're mixing up two things: MS was, and still is, not that good at networking and tries to make up for it with caching, of which oplocks still make us victims; that's separate from whether caching overall is a good idea.

Caching is used in several technology stacks: not only the network, but also hard drives and RAM, up to CPU caches.

So just switching caching off is bad style.

I am in favor of server-side cache busting, as it's always the server that knows when a resource has changed. And it knows it first. It's true that you are not in full control of what caches parts of your server's responses, but if you therefore don't use Cache-Control headers, you just contribute to this not working as well as it could.

In the SMS case, an etag handed out in a response and passed back in further requests could help the server even if no cache or CDN along the way catches and handles the request, because the server then only checks whether the etag is still current or the resource has changed. And in case it's current, the response can be empty, just status 304 - Not Modified.

In an app that needs to react to SMS asap, this would be a case for push notifications; then the whole cache topic would be obsolete, as you would not request the latest SMS(es), the server would push them over.


Chriss
 
I suggested that it is not that caching is a problem in itself; it is the ignoring of the cache control headers that causes problems, a thing M$ has done inconsistently since they started in the browser market.

The thing is we live in an imperfect world, and have workarounds for some of the more troublesome niggles.


Regards

Griff
 