Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Best way to store JPG files

Status
Not open for further replies.

madvalpreston

Technical User
Apr 7, 2011
142
GB
Hi

I am using Mysql as a backend for an Access front end. I want to be able to add site records which have photos of various parts of a building. It is the photo side of things I need advice on

1. What is the best field type that can store JPG files for example in Mysql
2. If MySql cannot handle such files then what is the best way to map or reference to each picture.

Thanks
 
MYSQL can handle images just fine as blobs, however, its strongly suggested you don't store images in a database. Maintenance of images becomes exceedingly complex, and the size of the db tends to be unnecessarily large.

You also require extra code to to get back the image data, and display it back to the viewer.

Personally I suggest you have an image folder, and simply store the image paths in the DB. That way all you do is retrieve a path and pass it to your image container, whatever it may be.

Should you ever need to update an image, overwriting the existing one in the folder is all it takes.




----------------------------------
Phil AKA Vacunita
----------------------------------
Ignorance is not necessarily Bliss, case in point:
Unknown has caused an Unknown Error on Unknown and must be shutdown to prevent damage to Unknown.

Behind the Web, Tips and Tricks for Web Development.
 
I disagre with Phil (ad I always do when this question comes up). I think storing images in the file system stops the site from scaling as without some extra logic or software you are limited to one server.
What I would recomend these days is to look as the Amazon S3 service to store images. Store then there an server them out with normal URL, masses of scale and virtualy no limit.
 
Hi

I disagree with ingresman and agree with Phil.

The only reason to store any binary data ( or any other data that can not be directly searched ) in the database is to apply to them an authorization system only accessible through the database.


Feherke.
 
What good is having images in different servers? Unless you have so many images you need to distribute them I see no actual benefit.

Also should ever require to change db engine, you might need to unload all your blob columns and recreate them, as each Db has a different implementation of the format.



BTW, not even Amazon stores images in a DB:




In any case, I guess it comes down to whether or not you need to be able to easily view the images outside the application or not.

Querying the Db directly for an image will simply provide a bunch of binary data, not an actual image to be viewed.



----------------------------------
Phil AKA Vacunita
----------------------------------
Ignorance is not necessarily Bliss, case in point:
Unknown has caused an Unknown Error on Unknown and must be shutdown to prevent damage to Unknown.

Behind the Web, Tips and Tricks for Web Development.
 
The answer to question 1 is, as Phil wrote is use a BLOB field. Depending on the size of the actual JPG (from my experience > 35kBytes - I have not done any extensive testing to determine how big a BLOB can be and I think it is written somewhere), you may have to use a MEDIUMBLOB field type. There is also a LONGBLOB data type. Oh, and, yep, the database is getting big. Especially since, shame on me, I do not regularly delete records marked for deletion.

The answer to question 2 is apparently open for debate. The answer for you is: "What do you want to do?" As Phil was alluding to, you need 'special code' to show the binary data of the image. Can Access acquire the binary data from the BLOB field and display it? I would think the FORM creation aspect of Access has an IMAGE component one could use to easily display images from a BLOB field.

I store JPGs, BMPs, and PNGs in BLOB fields all the time. I use CodeGear C++ Builder to create Win32 apps and I have no trouble acquiring, storing/streaming, retrieving, and displaying the images, so I find storing images in BLOB fields a convenience. I also store true type fonts and whole, parameterized, SQL queries in BLOB fields. In order to convince people to use my software I have to 'educate' them as to what a database is, including where and how data is stored. When people use my programs they always ask what the path and file name of the database is. I have to maintain some level of mysticism around the database so people are too afraid to touch it, but not too afraid to use it. If I stored the images as individual files and merely stored in the database the path and filename of the image, people would be too tempted to attempt to access the image files directly.

Steve.
 
Some good points Steve, security and image tampering is an issue. M$ share point holds everything in SQLServer (or even access) but that includes word docs and spreadsheets, they argue that physical security is better with with DB's and they have an incremental backup solution to keep dumps small.
My basic view is if I have a single web server ok I can store my images there no problems. If I start to scale out (i.e. add more servers) where do I store my images then?. I have to make a choice where to store them and also to retrieve them. I could store them on every server but that's wasteful. You could have an image server but it has to be accessable from all the web servers and introduces another failure point.
The code to get the image wouldnt be too bad, all it has to do is read the image from the DB set the content type and write all the bytes to the output stream.
I will concede that as the number of images gets large the database will get larger and there may be caching issues so this is why I would now design to hold a link in the DB pointing back to Amamzon S3. It's interesting that Amazon don't yet use S3 (they will eventualy).
 
Hi

ingresman said:
The code to get the image wouldnt be too bad, all it has to do is read the image from the DB set the content type and write all the bytes to the output stream.
Only if you not care about caching.

Otherwise you will have to also implement sending the [tt]Last-Modified[/tt] and/or [tt]ETAG[/tt] HTTP response headers and checking the [tt]If-Modified-Since[/tt] and/or [tt]If-None-Match[/tt] HTTP request headers, along with the managing of related modification times and/or ETags. Things web servers can do themselves, if they are let to.


Feherke.
 
I did say caching might be an issue, When I send images I give it an expirary date well into the future so it never goes back to the server, If I change the image I change it's name so a fresh copy is sent. PC will eventualy delete the unused image.
Check out yahoo dev tips on this technique.
 
No worries.
This is an important issue for dev's and the architecture of any given site I think. For static images like logos yes hold them localy but for images that are uploaded for shoping carts, photo sites etc, the issues of durability, scale, accessability and caching need addressing.
Would anyone like to put forward a solution to handle images on a webfarm and assess how the solution would scale over 10+ servers.
 
I would attack the issue like this:
When ever an image needs to be stored create a surrogate key consisting of the logged in user name + the image name e.g. fredJennyonthelawn.jpg
Create a hash of this key.
Need to stop here, assuming we have 10 web servers the images need to be spread across these 10 servers. So we have to map the hash value across the 10 servers which we might do by taking the modulus of the hash and 10 so we end up with a number between 1 and 10 which we will call server number. We can allocate the image to that server and when we want to get it back do the process in reverse so we identify the server and then issue a call to retrieve it. The obvious way would be via the web server as this takes away the connectivity issue.
But we are left with a problem. When we mod by 10 all is well is good. If we add another server we would then have to mod by 11 which will give us different results to when we use 10. For example if the hash of fredJennyonthelawn.jpg is 38 moding this with 10 gives us 8 but mod 11 gives us 5, in effect our images would be lost. We need to maintain the server number regardless of which physical server the image is stored on.

To get around this it is common to use some indirection or a logical to physical map. You would choose a value (say 50) which will represent the maximum number of servers you would ever scale to so our server number will never change as this is the value we use for the mod calculation. This should be a high value as making it too small will require all the hash values to be recalculated and balanced and being a high number is not an issue.

To illustrate I'll use smaller values of 4 physical servers and logical servers of 12. At a first pass the logical to physical map would be:
logical: 1 2 3 4 5 6 7 8 9 10 11 12
physical: 1 2 3 4 1 2 3 4 1 2 3 4
So logical server 7 would map to physical server 3.

If we now had to add a new physical server to give us 5 our map would look like.
logical: 1 2 3 4 5 6 7 8 9 10 11 12
physical: 1 2 3 4 5 1 2 3 4 5 1 2
So our logical server 7 now is actually physical server 2.

We now have another problem as we the images are now on the wrong server!. This is not a really big issue as introducing the new server probably meant that we were running out of space any way so we should the opportunity to rebalance in any case. So what we have achieved is we can store a scalable number of images and always know where they are, the application doesn't know or care as it only sees the logical server.

Of course the indirection an be done in a database with a pointer. This is just an alternative and gives you some transparent method of method of balancing images to servers (which is where the technique comes from, sharding)


 
I basically agree with Vacunita.

Drop the files onto a shared file system and be done with it.

By using a backing store to hold the images you can share the entire image collection with whatever servers should have access. Filenames should be generated based of criteria you are sure will be unique - this depends on what you are doing... a nearly foolproof way is to MD5( file + utc_time() ).


 
As I say, how would people implement such a solution e.g. scale, connectivity and privacy
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top