MySQL FS 198

xcyber writes "Developer, Database Admin and user, MySQL is developing a MySQL filesystem for Linux to mount a database on Linux as a fs. This is still in the development stage and the development team would like to receive comments on this. So please let us know." "Because you can," dammit. That's just plain awesome.
This discussion has been archived. No new comments can be posted.

  • Amongst all these other examples, it's probably worth noting that SQL is a declarative language. Basically, it allows you to express the results -- without worrying about the procedure used to generate the results.

  • You're absolutely right that a prototype could be built using current file systems, but said prototype would be SLOW and eat a LOT of space. It's better to use appropriate data structures and algorithms.

    And yes, file systems are databases; they're merely inflexible databases using ANCIENT technology. Not all databases are created equal.

    -Billy
  • by Phexro ( 9814 ) on Tuesday January 16, 2001 @10:13AM (#503740)
    phexro!pyramid:~$ SELECT * from pr0n WHERE sex='f' AND species='goat';
    --
  • by William Tanksley ( 1752 ) on Wednesday January 17, 2001 @10:25AM (#503741)
    Nice. However, first things first: any replacement for the current system has to start by doing all the things the current system does, at least as simply. This is the main reason I think 'cd' is a good command to include.

    It's BAD to try for too much with the first release. If you'd like an 'object system', by all means prototype one using conventional directories; you'll decide quickly that it's little different from modern Unix (remember ioctl!). In other words, an overly complex solution.

    We need a true file system, one in which ioctl isn't needed. See the latest plan9 OS for details.

    -Billy
  • by eric2hill ( 33085 ) <<eric> <at> <ijack.net>> on Tuesday January 16, 2001 @10:14AM (#503742) Homepage
    A while back (a year maybe?) Oracle [oracle.com] announced their iFS [oracle.com] product. Dubbed the Internet File System, it gave file system, IMAP, POP, FTP, and web access to the database through a common piece of software. I haven't had the chance to work with it, and it still may not even be available, but once you can store files in the database and enforce integrity, it becomes extremely easy to track revisions, maintain lists, and perform searches and reports. It seems like wonderful technology that should be a part of every OS, but I'm curious about performance. Has anyone had any experience with iFS?

  • Sorry for my bad formatting :-)

    Well, the story about this topic was at /.
    and the flames I got are still at /.
    Look it up for yourself.
    My point was: 90% of the people running Linux do not care about GPL, OSD etc. They like free-as-in-beer software, especially if it's stable and the source is included.

    You can forget my addition; I only wanted to show the attitude several people expressed.

    a'o's
  • A filesystem is also an interface to a flat file. /dev/hda1 is a file with a fixed size and FS is a method of organizing different files inside it.

    --
  • You can almost always access BLOBs (or equivalent fields) from different languages and environments; the problem is that they're all different. DBI, JDBC, ODBC, etc... each one has its own gotchas. Being able to access these fields as a part of a filesystem gives it the ultimate portability. Even shell scripts can quickly and easily access the database!

    Doug
  • The part I found to be eminently cool was to think of things the OTHER way around -- don't just think of seeing your DB as files, think of seeing your files as a DB.

    Imagine if you could move a bunch of Word200 documents (they ARE XML files, mind you -- you just need an OLE stream decoder a la wv to decode them) into a DBFS directory and provide the DTD for that datatype in a way that you could make SQL calls against the files (records) using the DTD as a table definition.

    You could then move files TRANSPARENTLY in and out of the DB, using the DBFS and automatically index against them.

    A few years ago, I was responsible for getting a bunch of documents on a website to be searched and sorted. I had directory upon directory full of procedure and process documents. One day, I found a program called Xerox DocuShare, (written in Python, BTW) and it used some features that were very similar to this idea.

    Cheers,
    Ken Crandall
  • The latest releases of MySQL *DO* support transactions...
  • Absolutely! Thanks for doing a great job at making my point.

    All one has to do is program for a dynamic, RDBMS-driven website, storing (and retrieving) images on-the-fly to see what I mean, adding your point, above. Of course, websites aren't the only place this is an issue.

    The "ultimate portability" and "Even shell scripts" points are really good.

    AS/400, Palm-OS, Be-OS, Reiser-FS, etc., all do something like this now, as has been pointed out.

    Of course, this now has to be implemented.

    I think they said they'd never land a man on the moon, or something like that. Then, there was this group at NASA.

  • The additional consideration for this line of thinking is:
    • Does MySQL have the same directed and well defined core of development that linux development had and has?[think kernel here]
    • Is the current base of MySQL well written enough, with enough source infrastructure to survive eventual restructuring during concurrent feature enhancement?
    • Is there, as there was with linux, little competition in similar projects offering a similar feature set that might attract more followers or be a better candidate than MySQL for development attention?
    I'm not saying that MySQL lacks any of these. But there are tons of open-source projects that just needed to get a bit better and never did, because they were never really good enough at the source level. Linux is a lucky case, but take heart: if there hadn't been Linux, you still could have run most of the GNU system on BSD thanks to GCC.

    Lastly, I'm largely unaware of any Linux-only apps that actually make or break a user's choice to use Linux vs. any other Unix. I think what really makes or breaks the choice is price point and perceived momentum. pauvre pauvre netBSD.

    -Daniel

  • Have any of you (fs!=db) nay-sayers ever tried to store/retrieve GIFs and JPEGs in a relational database for a web site -- an often daunting, but often necessary task?

    Well, no. It was too daunting for me. But, I'd like to.

    I recently started making a database where I could keep track of all my photos - a "photo database," if you will. (It's here [umich.edu], if you are curious.) I didn't store the photos in the database - primarily because there isn't enough room on the database server. I numbered all the images by hand and serve them from my personal computer - using MySQL and PHP on the database server to access them.

    Anyway, I want to organize my photos into groups - and maybe even subgroups. And I want the groups to be able to overlap. I haven't done this yet, because I don't have the time to (re)implement a file system inside the database! However, a "dbfs" seems to be exactly what I need for this task. It's close to what I envisioned.

  • Not necessary: I'd assume that when you get to the level below the field level in the dir structure, things would behave like links to items. That is, if I have items '325', '326', etc under Employee_ID, and items 'Smith, A.', 'Anderson, N.', etc, those would point to the same set of unique objects, specifically the data base records. So Mr. HR guy comes along, and if he knows that Smith gets a $100,000 raise, he doesn't have to know he's employee #326, just that he's an Employee, findable under a standard OS find function.

  • I like! I like! Theoretically, you could get faster data access, since it would write straight to a raw device the way Oracle or MSSQL can.
  • ..and that is why beos is just awesome.
  • Oracle has been doing this for years.

    I've recently been evaluating high end NAS/SAN (network attached storage arrays / storage area networks). The rep from netapp (makes high end NFS based devices) said Oracle is using their devices as the backing for their new E(whatever) service to compete with MS .NET. ... AND in the process, Oracle has forgone using their raw block access methods for, you guessed it, NFS-based Oracle database (to connect to netapp's netfilers) powering their own site. He said Oracle found they could map to netapp's high performance custom NFS file system and start to free up a huge engineering group devoted to optimized raw block access architectures. Sounds pretty f'd to me. And not surprising to hear from netapp, given their whole universe seems to revolve around NAS (NFS) as opposed to SAN (fiber). Anyone out there actually running Oracle on an array via NFS ?

  • Presumably, they just return a failed value. There don't seem to be any standard error codes for that type of error, but there's no reason another error code could not be used. Also, don't think exceptions, really - exceptions are cool in Java (I'm guessing that's what you use mostly :) ) but most other languages don't really use them. (I like exceptions better than error return codes, personally. But I have to live with error return codes... *sigh*)

    How about:

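    char buf[] = "foo,bar,bally";
    /* EINVALDATA is a made-up errno value; no such code exists today */
    if (write(fd, buf, sizeof(buf) - 1) == -1) {  /* -1 skips the trailing NUL */
        if (errno == EINVALDATA) {
            fprintf(stderr, "Invalid data\n");
        }
    }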

    The standard error codes (as specified by man 2 write: EBADF, EINVAL, EFAULT, EPIPE, EAGAIN, EINTR, ENOSPC, and EIO) don't really cover that scenario, but a return value of -1 from write indicates an error, with the reason left in errno.

    My question is how to form a path to a row/column (and are you forming paths to rows or columns?) - would it be something like /db_name/table_name/column_name/value or /db_name/table_name/columns_primary_key?
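
    For what it's worth, here is a rough sketch of the first path scheme asked about above. It assumes the hypothetical layout /db_name/table_name/column_name/value; struct db_path and parse_db_path are made-up names for illustration, not anything the MySQL filesystem project is known to provide.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: /db_name/table_name/column_name/value */
struct db_path {
    char db[64];
    char table[64];
    char column[64];
    char value[64];
};

/* Split `path` into its four parts.
 * Returns 0 on success, or -1 unless the path has exactly four
 * non-empty components of at most 63 bytes each. */
int parse_db_path(const char *path, struct db_path *out)
{
    assert(path != NULL && out != NULL);

    char extra; /* only filled if something follows the fourth component */
    int n = sscanf(path, "/%63[^/]/%63[^/]/%63[^/]/%63[^/]%c",
                   out->db, out->table, out->column, out->value, &extra);

    /* n == 4: all four components matched and nothing was left over */
    return n == 4 ? 0 : -1;
}
```

    With this, parse_db_path("/hr/employees/name/Smith", &p) fills p.db with "hr" and p.value with "Smith", while a path with too few or too many components is rejected. The primary-key variant would just be a three-component version of the same idea.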

  • I think this is a brilliant idea. It opens up the database to a whole slew of standard commands, but in particular it makes sure the database has a sensible way of being accessed.

    As well, this would be fantastic for configurations (in particular the complex ones of Gnome and KDE) since large amounts of data could be elegantly compartmentalized in a standard way. I find this nifty with the growing complexity of filestructures in these config sets, they would be open to editing and updating through the standard filesystem method, or through a standard SQL query system.

  • filesystem forks? I'm rambling, I mean resource forks of course...
  • You're only partly right.

    You still cannot provide anything universal or that can be done by an end-user. Only having fs access to a db allows for this. First of all, name one universal BLOB that works exactly the same for all db's that support BLOBs (there aren't any). Name one standard SQL command that does all this. Name one standard piece of source code that works in all languages, all the time, for all OSes. Name one totally standard API/interface/protocol/whatever. None of the above can be done.

    Unfortunately, BLOBs are not universal. Nothing works exactly the same way everywhere, all the time. And, let's just assume that you'll be using one DB on one OS all the time in one programming language, just to make things as easy as you claim it is. Things still aren't clean, since you will have to include code repeatedly in all your apps. Possibly, you may have to change your code if your tables change. Assuming you do everything the "right way" and use an interface such as ODBC, JDBC, or DBI/DBD, and assuming you write good OOP that is generic, you still cannot take that code everywhere to all apps all the time, even in the same programming language/OS. There will always be porting, adaptations, and recoding to get this to work everywhere with all your apps. In fact, everything needs to be planned so that all apps that will be using your code should follow the same conventions all the time.

    To avoid this mess and to make life easier for end-users, we could mount the DB as a file system. This gives apps, APIs, libraries, OSes, end-users, etc. the ability to query, read, write, and modify data, even if the platform doesn't even support SQL! By mounting the DB as an FS, you give (nearly) all apps the ability to work on your data (where db data==files), just by being able to open and close files! This is the ULTIMATE layer of abstraction, making access truly UNIVERSAL. (Security restrictions/permissions still should apply, of course.)

    There is absolutely NOTHING universal about what you suggest. Nor do all BLOBs work, even in theory, as you suggest. Nor do end-users benefit. Nor do all apps automagically get access to your data just because you wrote something in language a for database b for OS c to support program d.

    Oracle 8i's IFS, Informix's data blades, MySQL-FS, PGFS, etc. all have been written by the db experts to address these deficiencies. Why do they disagree with most of the people on this post?

    It's because they're right.

  • Section 14.2.2 has a discussion of File Systems versus Databases written by M. Satyanarayanan (of AFS and CODA fame). He says that although file systems and databases have much in common, there are several areas in which they differ conceptually, including encapsulation, naming, and the ratio of search time to usage time. Basically, file systems are appropriate when there is high temporal locality, while databases are used in situations where there is little locality and concurrent read and write sharing of data at a fine-grained level is required.
  • trollalicious.

    Nothing about flickering, though.
  • BeOS Specs.. scroll down.. [be.com] Basically (as anyone that's used BeOS knows) file information is stored in file attributes, i.e. an email is just a text document with sender, recipient, date, etc. attributes, with the body as the contents of the text document. This is really useful for mp3s: every bit of an ID3 tag can be given its own attribute, and then you can search based on the attributes and such..
  • by cazz ( 40137 )
    All of the high-end RDBMSs use raw file system access. By not using a "regular" file system, you gain a huge performance jump. If MySQL is doing its own locking and recovery, then the overhead from the file system is wasteful.

    Oracle has been doing this for years.

    This is just another important hurdle that MySQL needs to clear before it is considered for high-end applications.
  • All it does is let you use filesystem calls to access the database. I mean, MySQL uses files that are managed by the Host System's filesystem. So, basically, all it's providing you with is another API that can access the database. That is nice, but I wouldn't necessarily expect anyone to use this for actual data storage. It would seem to me that to use this for the same purposes as a filesystem, it would only add overhead to the process, and provide limited - if any - added data storage benefits.


    -------
  • They don't even have foreign key support or subqueries... In my opinion they have some better things to do than this (although it might be useful).
  • Select * from /dev/hda1 where filetype="mp3" AND artist="moby" and bpm>120

    that would be SO cool :)
    ---
  • Absolutely!

    And, I'm sure somebody out there probably thinks the whole directory structure is too relational, or db-like, already. Give me a break!

    Yes, a file system puts flat files inside a flat file; drives are more logically mapped anymore than they are physically mapped; RAID, LVM, and partitioning in general divvy that up into pieces; databases put relational data into a bunch of indexed flat files; BLOBs are nothing more than flat files stored in databases.

    So, what is a drive or a partition or a file or a database? The lines are already so blurry you can't tell the difference, anymore, and you wouldn't want to, unless you want to go backward!

  • Hey, that's MySQL we're talking about here. Why do you even care about relational integrity?

  • Actually working along those lines with a hierarchical system for arranging TCL variables. Check dat shit out:

    http://www.etoyoc.com/odie [etoyoc.com]

  • This sounds vaguely similar to how PalmOS apps work.

    The entire filesystem is based around the idea of a database, where memory chunks are accessed based on the name of the app, etc....
  • Warning: I'm human. Sometimes stuff I post here is wrong. Use your head. Question authority

    this must be one of those times. without some way of querying your file system (except for ls) you lose the relational aspect of the database. then you just have a filesystem that stores metadata for fast recovery. this is good from a fs point of view, but it does nothing to help you find the files you are looking for. it provides no relations between files, and does not store file descriptions (that are useful to a human).

    eg. this is an image file that can be categorized under political, humor, bill clinton, letch, etc.

    use LaTeX? want an online reference manager that
  • ...but then I could still type. You'd hate that ;)

    OK.. but which is it? You first say if you were to measure it to the "micro second, some difference might be noticed". Then three sentences later you say there is "no difference". You seem confused, and you were right the first time.

    There is a difference (however minute) and it's due to BeOS not actually having the ability to create fields at the FS level. It's trivial to create file examples that show off the flaws in their method and to make performance suffer. There are filesystems that do it better, through better architecture. Realise this and you'll see my point - that BeOS can improve its so-called meta FS.

    ps. I've used BeOS for several years now on PPC and x86, programmed for it, and you're probably using some of my apps (here's a hint - Doublin). Oh, and there's much more anger and arrogance without fact or links or proof than anything I'm putting out, believe me.

    -- Eat your greens or I'll hit you!

  • In my opinion it would make more sense to define a general mapping of (R|OO|XML)DBMSs into the filesystem.

    If MySQL is only a prototype for that, this is fine! However, if this is meant as an improvement for MySQL, I doubt it is the right step currently, as there are a lot of more pressing features like locking, caching and general performance in multiuser environments.

    And of course it would make sense to be able to describe table mappings from existing standard Unix configuration files like /etc/passwd into a meta level (the description), to examine/query normal file content with SQL.

    e.g.: select user, shell from /etc/passwd where shell != /usr/bin/sh

    or better: echo "why don't you use bash?" | mail `select user, "," from /etc/passwd where shell != "/usr/bin/bash"`
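
    For the curious, the /etc/passwd query above can be approximated by hand without any SQL layer. This is only a sketch under assumptions: the input is passwd(5)-style text already in memory, the shell is taken to be the last colon-separated field, and users_without_shell is a made-up helper name, not part of any proposed mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Collect the login names of every passwd(5)-style line in `text`
 * whose last field (the shell) differs from `shell`. Names are written
 * to `out`, separated by commas. Returns the number of names written. */
int users_without_shell(const char *text, const char *shell,
                        char *out, size_t outsz)
{
    assert(text != NULL && shell != NULL && out != NULL && outsz > 0);

    int count = 0;
    size_t used = 0;
    out[0] = '\0';

    while (*text != '\0') {
        const char *line = text;
        size_t len = strcspn(text, "\n");      /* length of this line */
        text += len;
        if (*text == '\n')
            text++;                            /* step past the newline */

        /* user name: everything before the first ':' */
        size_t ulen = 0;
        while (ulen < len && line[ulen] != ':')
            ulen++;
        if (ulen == 0 || ulen == len)
            continue;                          /* empty or malformed line */

        /* shell: everything after the last ':' */
        size_t s = len;
        while (s > 0 && line[s - 1] != ':')
            s--;
        size_t slen = len - s;
        if (slen == strlen(shell) && strncmp(line + s, shell, slen) == 0)
            continue;                          /* same shell: filtered out */

        size_t need = ulen + (count > 0 ? 1 : 0);
        if (used + need + 1 > outsz)
            break;                             /* output buffer is full */
        if (count > 0)
            out[used++] = ',';
        memcpy(out + used, line, ulen);
        used += ulen;
        out[used] = '\0';
        count++;
    }
    return count;
}
```

    The point of the comparison is how much bookkeeping this takes next to the one-line select: field positions, line splitting and buffer limits are all things a table description of /etc/passwd would take off the programmer's hands.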

    Regards,
    angel'o'sphere
  • A system like this has been implemented at Xerox PARC. It's called Placeless Documents [xerox.com]. It seems to have ended, but there's a follow-on project called Harland [xerox.com] that provides an attribute-based storage mechanism for Java (and is available "for trial use", whatever that means).

    I've seen presentations about Placeless Documents and it's really cool.

  • Q: What are BLOBs?

    A: Flat files stored in a database!

    Q: What is a database?

    A: Data stored in indexed flat files!

    Q: What is an index?

    A: More data stored in flat files, or a database of metadata that relates to the order of data!

    Q: What is a file system -- Unix/Linux?

    A: Flat files stored in a database (that's why it's called a file system) -- with at least one flat file, such as /dev/hda1!

    Q: What is a hard drive (in present-day terms)?

    A: A device that logically maps data on a physical medium, or a database of sectors!

    Q: What is version control, such as CVS?

    A: A database of changes to flat files.

    Q: Why don't (many) people get this?

    A: Maybe they haven't been a DBA, haven't taken a class about modern operating systems, haven't developed a dynamically-generated website with images, haven't been a frustrated programmer dealing with dynamic and static data simultaneously, don't organize data well, haven't read about new FS'es (such as Reiser-FS), haven't used asset management software, think there really is a distinction between filesystems and db's, store all data in flat files, etc.

  • Since I haven't used SQL in a few years, what command would make !(this SQL command) [e.g., no goats, please]?

    -
    -Be a man. Insult me without using an AC.

  • by jd ( 1658 )
    Hmmm. Interesting concept. Not sure what the use would be. 'where' would be easy to implement in SQL, though.
  • This is a great idea if it's implemented well. The AS/400 is an example of a system that was entirely implemented around the idea of a full-featured DB implemented as a filesystem.
    ...and since it still has one of the best uptime records in the industry, and transaction processing times that consistently rank in the best-of-the-best lists, it's a good platform to imitate. Too often it's overlooked because of the green-screen terminals, but at its core, the AS/400 is easily one of the most advanced implementations of computer technology available to the general public.
  • Well, SQL won't work -- I just don't need (or want) the slowdown from interpreting and abstracting SQL commands -- and yes, I want *all* the speed I can get.

    I am looking for something way lower level. ReiserFS isn't a bad solution, I just don't believe that their plugin API is mature enough to base another project on top of -- or am I wrong?
  • Good point -- didn't think about being able to use the "Find" function of an OS.

    However, how is this better than a dedicated web app for HR flacks?

    Find your favorite non-computer-literate person and see if they even KNOW that there's a "Find Files and Folders" in their Start menu? (I'm assuming a Windows-centric office here)

    Like I said, I think it's cool and potentially useful, but probably not as useful for non-nerds

  • I'm not sure what you mean when you say 'statically' or 'dynamically'. I suspect, however, that you're assuming that the foundation of a fluid file system is a set of files, directories, and links. It's not. It's almost certainly a relational database, one optimised for the task of getting a set of items (objects, files, whatever) which are categorised under a given set of categories.

    I also don't know what you mean by 'number of possible categories'. I think you're mistaking 'categories' for 'sets of categories'. In my example, "/etc/wtanksle" is a set of two categories; "etc" is a category. I could see some reason to cache the results of category queries; that's an optimization concern, and not my specialty. I don't see any reason to try to precache all possible queries, as you seem to imply.

    Its speed will be almost irrelevant; I predict that it'll be about as fast as the current system, but even if it's hundreds of times slower it'll still be fast enough, since caching is trivial and looking up a file based on a full filespec is almost never done.

    -Billy
  • Would the kernel be anywhere near where it is today if people hadn't gotten others interested by writing intriguing, linux-only apps? Probably not.

    Your analysis is wrong.

    Most anything shipped on linux distro is nothing more than a Unix program PORTED.

    Unless GCC, X and others are 'linux only' apps.
  • by MenTaLguY ( 5483 ) on Tuesday January 16, 2001 @10:26AM (#503800) Homepage

    Much of the ultimate point of ReiserFS is the marriage of databases and filesystems (filesystems are really just a limited sort of database anyway). This is the reason for all the commercial funding; there are people out there who really want this.

    See Hans Reiser's White Paper [namesys.com] for information on where he's going with this.

    For what it's worth, database filesystems are not a new thing at all. Hans is just planning on accomplishing this in a way that completely preserves the Unix file metaphor and related concepts.

  • This seems like a horrible idea. The idea of both a filesystem and a database is to store data in a (hopefully) secure, long-term fashion. However, to call their approaches radically different is understatement bordering on absurdity.

    A database is about data. The data is partitioned into tables and columns, with a large number of additional constraints (unique, primary key, foreign key and check clauses, for example) to limit the values of the data. Additionally, the data is strongly typed. In order to access this data, SQL supports very high level commands, like SELECT, INSERT, UPDATE and DELETE.

    The power of a database is most basically in its very high level nature. You, as the user/programmer, do not care where the data is, who else is using it, how it is stored, or what the old values were. The database management system takes care of all of that. Other powerful features of databases include indexes, joins, subselects, real NULLs, aggregate (set) functions, and GROUP BYs (sub-setting).

    Now, contrast this with the low level file/directory structure. In this, you have a hierarchy of directories, each of which contains one or more files. A file is nothing more than a stream of bytes, and the only constraint they have is that they be uniquely named within their directory. Also, a single file can be in more than one directory.

    In order to use a file, the programmer must know where the file is, possibly who else is using it (with lock files, for example), what format the data is stored in and, if they want to be able to undo their actions, the old values. The advantages to files are the plethora of tools for manipulating them (at least in the case of text), and lower startup cost (eg. it takes less time to make a stupid file format than a SQL schema).

    This project is therefore brain-dead as an application development platform. 'But,' I can hear the reply, 'it's useful for users who want to change the data in the database.' Reply: every database accepts SQL, which modifies the data. Some SQL APIs I've seen only take two lines of code to retrieve some data. And SQL won't shit on your data if you accidentally type it in in the wrong format; it'll complain, but your data will be safe and secure.

    This is quite possibly the worst idea I've ever heard. Worse than Linux as an Internet Explorer plugin, worse than Napster as a family tree generator, worse than Quake III as a spreadsheet, and even worse than Apache as a VMS shell.

    Not that I have anything personal against it.

  • This sounds really cool, but it seems there could be some problems with implementation. If you build category listings dynamically, this drastically slows down tasks like a simple directory listing (or even locating a file by name), because you start having to do searches. Of course you can speed this up w/ good indexing, but you still have to pull those indices off the disk and do a fair amount of processing.

    You might be able to build some of the categories statically, but if your fs is truly fluid, then the number of possible categories is gonna be too huge to build and maintain statically. Maybe it needs to be a little less liquid, or maybe you can find a way to identify commonly accessed files/categories and build that stuff statically, then do everything else dynamically.

    I also think this needs to integrate with rather than replace a traditional fs. I doubt this method will ever be as efficient in terms of looking up, creating, and deleting files as a traditional fs, so it would be bad for system stuff like /bin, /tmp, etc. OTOH, it would be great for home directories where the user is mostly storing documents and a relatively minor performance hit isn't noticeable.
  • Early versions of BeOS did use a database FS for the entire system but they dropped it by R4 (I think that's the right version) because of performance issues.

    The early versions of BeOS used a separate database (not very complex) and filesystem, which wound up being very difficult to work with, so eventually they merged the two. The "database" aspects of the BeOS filesystem are more of being able to add (relatively) arbitrary data to particular filetypes, and do searching based on those criteria. It isn't a formal database in any sense of the word.

    Versions of BeOS prior to the Preview Release had a file system and a separate database. Because it was difficult to keep the data in the two separate systems consistent, it was decided that they should merge. This happened in Preview Release 1, and BFS remains relatively unchanged today.

    At the time there was a lot of enthusiasm for the merged design to be a database-based file system, but after a lot of research, Dominic Giampaolo, the engineer doing the design and coding, determined that wasn't going to work. The reason is it becomes too difficult to filter out the files you aren't interested in. There is a lot of organizational value in a hierarchical, structured, traditional file system.

    The design for BFS that was implemented is best described as an "attribute-adorned file system," with a query engine that can search against the attributes, and some indexing to make common queries fast. There's a fairly simple query language (along with simple GUI tools), but it's not as complex or capable as SQL (nor would you really want it to be). You can execute those queries from the command line if you want, which can be pretty useful when piped to another program (much as find is in Unix, but simpler to work with).

  • Isn't it amazing how much more you post to slashdot and stuff right after you break up with a girl?

    Just went through it too. Good Luck ;)
  • by Hard_Code ( 49548 ) on Tuesday January 16, 2001 @12:01PM (#503817)
    Damn, you...both of you stole *my* idea! ;)

    For a long time now I've been thinking about filesystem-as-database concept. We've passed the point where computing is about optimizing hardware resources. It is now about optimizing *user* and *information* resources. If your hardware is blazingly fast, but you are lost in a sea of irrelevant information, you can't do anything. I think that's where the database/meta-filesystem comes in.

    With all this rich content around, we should not be searching for files based on some arbitrary linear categorical name. We should be searching on *attributes*. We should be searching on *association*. E.g., "List all files relating to my work that I have stored on my home computer", "Now, of those, show me all files that pertain to status reports". Or "List all data I have on the artists and bands in my music collection". etc.

    This is where plain, flat, hierarchical file systems fail. We need basically a data "repository", and various ways of obtaining information from that repository, based on attributes, categories, mime types, relation to *other* files, etc.
  • Reiser FS is for building a database as a filesystem. See namesys.org .

    Bruce

  • by William Tanksley ( 1752 ) on Tuesday January 16, 2001 @10:31AM (#503820)
    This is exciting on a number of levels, even if the specific database being used isn't my choice; I've been looking for a suitable base for some of my ideas regarding a "fluid file system" (someone else generated a good writeup, calling their ideas a liquid file system [windseye.com], but my name is better :-).

    In my vision, 'documents' would be categorised, and the categories could be viewed in a manner very similar to how we now view directories, except that a file is in more than one folder at a time. A file which is named /etc/wtanksle/ppp.conf could also be referred to as /wtanksle/etc/ppp.conf, or if it's unambiguous, /etc/ppp.conf. /dev/removable gives the list of all removable devices; /dev/scsi gives the SCSI devices (including the removable ones).
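
    To make the naming rule concrete, here is one possible reading of it as code, and only a sketch: a path is treated as an unordered set of categories plus a final file name, and a query matches a file when the names agree and every category in the query is one of the file's categories. The function names are invented for illustration, and the "if it's unambiguous" check (exactly one file matching) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the component c[0..n) appears among the '/'-separated
 * components of path[0..plen). */
static int has_component(const char *path, size_t plen,
                         const char *c, size_t n)
{
    size_t i = 0;
    while (i < plen) {
        while (i < plen && path[i] == '/')
            i++;                              /* skip separators */
        size_t start = i;
        while (i < plen && path[i] != '/')
            i++;
        if (n > 0 && i - start == n && memcmp(path + start, c, n) == 0)
            return 1;
    }
    return 0;
}

/* Return 1 if `query` names the file stored under `tagged`: the file
 * names are equal and the query's categories are a subset of the
 * file's categories, in any order. */
int query_matches(const char *query, const char *tagged)
{
    assert(query != NULL && tagged != NULL);

    const char *qname = strrchr(query, '/');
    const char *tname = strrchr(tagged, '/');
    if (qname == NULL || tname == NULL || strcmp(qname, tname) != 0)
        return 0;                             /* different file names */

    size_t qlen = (size_t)(qname - query);    /* category part of query */
    size_t tlen = (size_t)(tname - tagged);   /* category part of file  */
    size_t i = 0;
    while (i < qlen) {
        while (i < qlen && query[i] == '/')
            i++;
        size_t start = i;
        while (i < qlen && query[i] != '/')
            i++;
        if (i > start &&
            !has_component(tagged, tlen, query + start, i - start))
            return 0;                         /* category not on the file */
    }
    return 1;
}
```

    Under this reading, /wtanksle/etc/ppp.conf and /etc/ppp.conf both match a file categorised as /etc/wtanksle/ppp.conf, while /home/ppp.conf does not. A real implementation would presumably answer these lookups from an index rather than by comparing strings.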

    The potential uses are many -- I think it would make a lot of common computer tasks a lot easier.

    Oh well -- anyhow. :-)

    -Billy

  • by Galvatron ( 115029 ) on Tuesday January 16, 2001 @10:33AM (#503824)
    The licensing issues are, for many people, the MOST important. Just because this won't be very good at first doesn't mean it won't improve, and most likely the more popular this project gets, the more work will get done on MySQL.

    Take the linux kernel. Would the kernel be anywhere near where it is today if people hadn't gotten others interested by writing intriguing, linux-only apps? Probably not. Perhaps one day MySQL will evolve to the point where this will be useful, perhaps due to developers attracted by this project.

  • by gimpboy ( 34912 ) <john.m.harrold@ g m a i l . c om> on Tuesday January 16, 2001 @10:52AM (#503825) Homepage
    i've been working on something sort of similar. i upload a file into the database (currently storing the files on a normal partition) and the file has associated with it a file type, description, md5 hash, and a couple other things. now whenever i want a picture of clinton, i do a select where file type is image and description has the word clinton in it. right now i only have a php interface, but i have a friend who's going to do a perl/console interface.

    the cool thing is that i can stream the data via apache to whatever application i want. so i'm going to upload all of my mp3's and build file lists based on the primary keys of the files. then i can stream the data to mpg123/xmms. it works really well, and since i store the md5sum i can prevent myself from storing exact copies of a file.

    i'm using postgres right now. if they had the ability to mount raw partitions, and got over the 8k limit (this one's coming soon), that would be great. it would make backups easier. now i just have to dump the database and then back up the db dump and the /files directory to tape.
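
    a minimal sketch of the scheme described above, in python with sqlite3 standing in for postgres. the table layout and the upload and search helpers are assumptions for illustration only; the real setup uses a php interface and is not shown in this post.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        id INTEGER PRIMARY KEY,
        path TEXT NOT NULL,          -- where the bytes live on the normal partition
        file_type TEXT NOT NULL,
        description TEXT NOT NULL,
        md5 TEXT NOT NULL UNIQUE     -- rejects exact copies of a file
    )""")

def upload(path, data, file_type, description):
    """Record a file; return False if identical content is already stored."""
    digest = hashlib.md5(data).hexdigest()
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO files (path, file_type, description, md5) "
                "VALUES (?, ?, ?, ?)", (path, file_type, description, digest))
    except sqlite3.IntegrityError:
        return False
    return True

def search(file_type, word):
    """Return paths of files of one type whose description mentions word."""
    rows = conn.execute(
        "SELECT path FROM files WHERE file_type = ? AND description LIKE ? "
        "ORDER BY id", (file_type, f"%{word}%"))
    return [row[0] for row in rows]
```

    the unique constraint on the hash does the duplicate check, so the application code never has to compare files itself.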



    use LaTeX? want an online reference manager that
  • The everyday user won't exactly go nuts over it, though.

    The site gives the example "imagine marketroids browsing through the directories to directly access columns and entries" (or words to that effect)

    No way. Hey, don't get me wrong, I LIKE that idea, and it gives me a pretty cool idea for a couple of projects that I'm working on, but think carefully about it: any sufficiently useful database for a large company is also sufficiently large that a directory tree is absolutely the slowest and most confusing way to access data held within a database.

    For example, compare two ways of giving employee 325 a raise:

    • input SQL directly: "update employees set salary = '100000' where employee_id = '325'"
    • browse directory tree: "okay, double-click on "Databases", double-click on "Human-Resources", double-click on "Employees", scroll down until you see "325", double-click on "325", then double-click on "Salary". Change the number to "100000"."

    It's not bad, but it's not as good. Plus, with good programmers (and good communication between programmers and management), the SQL is so abstracted out, it makes no difference. It gets condensed to a list of names and a checkbox next to the names. Those that get "checked" get a raise to $100,000.
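
    For what it's worth, the directory walk above is mechanically the same UPDATE. A minimal sketch in Python, with sqlite3 standing in for MySQL; the /table/key/column path layout, the SCHEMA whitelist and write_path are assumptions invented for this example, not how MySQL-FS actually maps paths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES (325, 50000)")
conn.commit()

# Table and column names cannot be bound as SQL parameters, so the sketch
# only accepts names listed in this hypothetical whitelist.
SCHEMA = {"employees": {"key": "employee_id", "columns": {"salary"}}}

def write_path(path, value):
    """Treat writing value to /table/key/column as a one-row UPDATE."""
    table, key, column = [part for part in path.split("/") if part]
    info = SCHEMA.get(table)
    if info is None or column not in info["columns"]:
        raise FileNotFoundError(path)
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            f"UPDATE {table} SET {column} = ? WHERE {info['key']} = ?",
            (value, int(key)))
    if cursor.rowcount == 0:  # no such row, so no such "file"
        raise FileNotFoundError(path)

write_path("/employees/325/salary", 100000)
salary = conn.execute(
    "SELECT salary FROM employees WHERE employee_id = 325").fetchone()[0]
```

    The translation is easy; the point stands that clicking through the tree is still the slower way to say it.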

    To be truly useful to non-programmers (or non-analytical thinkers, if you will), the MySQL-FS would have to abstract out so much of the Database, you're back to a filesystem and a set of scripts to update a MySQL database.

    It's cool, but it's not for your regular joe. Beyond a couple of levels, the average computer user gets lost in a hierarchical filesystem -- assuming they don't fill it up with "Untitled Folders" and such.

  • Why does everyone associated with SQL suddenly think that the more things that use that bloated, outdated language, the better? What is the point of this? It is just adding a very thick, slow, and unnecessary layer to something where it is normally essential that it works as quickly as possible.

    --

  • You can compile your SQL statements at the beginning of your program, so that they aren't reinterpreted later. Thus, unless load time is essential to you, you may be better off with an existing database.

    Regarding the ReiserFS plugin API, you're probably right. However, you don't necessarily need plugins if your project is simple enough. That is to say, if all you're doing is associating a set of data with a key, you make a file (named by the key) and put the data in. Need multiple keys? Use symlinks.
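
    The file-per-key idea needs nothing ReiserFS-specific to try out. A small sketch in Python on any filesystem that supports symlinks; put and get are hypothetical helper names, and the record contents are placeholder data.

```python
import os
import tempfile

def put(root, key, data, aliases=()):
    """Store data in a file named by key; add symlinks for extra keys."""
    with open(os.path.join(root, key), "w") as handle:
        handle.write(data)
    for alias in aliases:
        # Relative link, so the store keeps working if root is moved.
        os.symlink(key, os.path.join(root, alias))

def get(root, key):
    """Look a record up by its primary key or by any alias."""
    with open(os.path.join(root, key)) as handle:
        return handle.read()

root = tempfile.mkdtemp()
put(root, "user-11023", "record data", aliases=["jsmith"])
by_key = get(root, "user-11023")
by_alias = get(root, "jsmith")
```

    On a conventional filesystem every record costs at least one block, which is the overhead lightweight files are meant to remove.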

    If your project is of some size, lightweight file support will likely be done before you are (it certainly will if you throw Reiser some money -- he funds his team that way).

    Really, though, I think SQL is almost certainly your best option. The hashing and caching done by most modern databases more than makes up for whatever speed is lost to SQL support -- and once again, that speed loss is a load-time thing only if you write your app correctly.
  • The reiserfs folks are working on both a plugin API and additional hooks to add access to its DB features (which have been designed into the low-level stuff for quite some time).
  • Unix: "Everything is a file" Linux: "Everything is a file except for the files, which are records."
  • Hmm... yes, Reiser may be a way to go. Some benchmarks are in order. But, alas, SQL-based DBs are still too slow for what I am planning. SQL commands/queries etc. may be interpreted to some intermediate language/bytecode, *but* the real slowdown comes from the abstraction layers needed to support SQL queries and the like.

    Again, for a normal application you're absolutely right. But if you want to push/crunch a few GBs around in a coupla minutes, every little slowdown counts :-)... That's why most high-end datamining applications don't use RDBMSs...
  • Well, I am glad you're happy, but just about anything implementing a b-tree or skip-list implementation exclusively in RAM will get blazing speeds. The problem of course is, what happens when your application's needs exceed practical RAM sizes (say 7-8GBs these days)?

    I think a well-balanced solution with cache and FS-level access (ReiserFS maybe, in a coupla years from now) will do better. Although, I am really more impressed with SGI's XFS.
  • by Loge ( 83167 ) on Tuesday January 16, 2001 @12:11PM (#503842)
    The AS/400 uses a relational database as a universal data store for all system, application, and user data resources. The database is protected with very fine-grained access privileges and managed with well-defined administrative tools, which dramatically boosts security (since there is only one global security mechanism to manage all system and application resources).

    This approach also simplifies development, which helps to make the AS/400 such a powerful application engine.
  • No, not quite that low-level :-)... B/B* tree implementation and the ability to handle well over 2GB of data comfortably (speed wise) is also a must --say around the neighborhood of ~1TB. Multi-user capabilities are also good, and ACID would be cool, but not a must.

  • The abstraction layers on *your* end or that of the database? The former don't need to exist (one word: "inline") and the latter have been optimized very, very heavily.

    Not all SQL-based databases are alike. If you have the hardware budget for a {SMP,clustering,mainframe} system, a good RDBMS will take advantage of it -- something which might not be said of solutions optimized to perform well on lower-end hardware.

    So, yes -- do your benchmarks, on hardware comparable to what you'll be using for your actual production system. And don't count SQL-based DBs out yet; I would be entirely unsurprised if the overhead which makes them flexible is more than made up for by the heavy optimizations done elsewhere.
  • In some senses. But they're not exactly isomorphic.
    • ReiserFS provides a way that you should be able to efficiently build a DB hierarchically as a set of directories and files, where files are the "leaf nodes" that contain field data, and where you might use symbolic links to represent secondary indices.

      It would provide pretty "weak typing" of a sort of TCLish style where "everything is a string, sort-of."

    • In contrast, MySQL provides a way of representing "structured data," with "strongly typed fields." And the filesystem view provides a convenient way of looking at that data.
    In effect the ReiserFS approach is to provide a way of building "weakly-typed" hierarchical databases; MySQLFS provides a way of putting a conveniently-browsable hierarchy on top of a strongly-typed relational database.

    There are probably a lot of useful applications out there that wouldn't care much about the distinctions. That probably parallels the way that a lot of applications out there don't really care that MySQL does not satisfy the ACID properties or offer triggers, foreign keys, or other such things.

    It also might be regarded as parallelling the way that Lisp-like languages have "strongly-typed data" with dynamic typing, which is a bit the way ReiserFS might be used, whilst "MySQLFS" looks a bit more like the "static strong typing" of ML/Haskell. Which is a rather weaker analogy...

    In any case, the distinctions between ReiserFS-as-DB and MySQLFS are fairly strong. MySQLFS looks a lot, by the way, like the NameSpace concept in Casbah. [casbah.org]

  • by Doc Hopper ( 59070 ) on Tuesday January 16, 2001 @10:34AM (#503850) Homepage Journal
    One advantage of Zope [zope.org] is easy access to the database via FTP. Although this isn't a true "UNIX file system", it can demonstrate the value of using a DB filesystem -- you FTP files up, and with built-in versioning you can view any number of versions via the Zope interface.
    I believe that is one of the goals of ReiserFS as well -- that database vendors use file systems to store data instead of having to use raw disk partitions, or deal with file system overhead plus database overhead...

    Matt Barnson

  • Lets see...

    goober:$ cd /mnt/sqldb
    goober:$ ls
    USER_ID FIRST_NAME LAST_NAME TIMESTAMP
    goober:$ mkdir AGE
    goober:$ echo "Oh crap, there goes my schema!"
    "Oh crap, there goes my schema!"
    goober:$ cd USER_ID
    goober:$ ls
    11023 11025 11044 11055 11092
    goober:$ rm 11023
    goober:$ echo "Wow! I hope that wasnt relational!"
    "Wow! I hope that wasnt relational!"
    goober:$ exit

    Seriously, what type of integrity checking will be enforced in this filesystem?

    I am betting that you either have robust integrity, which would give a completely counterintuitive file system, or lax integrity, which would open the doors for all sorts of mischievous errors and data corruption.
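
    To make the "robust integrity" half of that bet concrete, here is a sketch of what rm could look like when the database enforces references. It uses Python with sqlite3 as a stand-in (MySQL at this point has no foreign keys); the orders table, the rm helper and the error string are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         user_id INTEGER NOT NULL REFERENCES users(user_id));
    INSERT INTO users VALUES (11023, 'Smith');
    INSERT INTO orders VALUES (1, 11023);
""")

def rm(user_id):
    """Emulate `rm USER_ID/<id>`; refuse while other rows still point at it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("DELETE FROM users WHERE user_id = ?", (user_id,))
    except sqlite3.IntegrityError:
        return "rm: cannot remove: row is still referenced"
    return "ok"

blocked = rm(11023)                                    # refused: an order refers to it
conn.execute("DELETE FROM orders WHERE user_id = 11023")
allowed = rm(11023)                                    # now the delete goes through
```

    That is exactly the counterintuitive part: a file you own and can see, which rm refuses to delete for reasons that live in another directory.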

  • Umm... Berkeley DB and your favorite C compiler? ;)
  • All of the high-end RDBMSs use raw disk access. By not using a "regular" file system, you gain a huge performance jump.

    Which actually has nothing to do with the posted article. This article is about creating a virtual filesystem that serves as an interface to the contents of an existing MySQL database (kind of like the /proc filesystem). It doesn't mention anything about creating a special filesystem for MySQL to store its internal data in a more efficient form.

  • This database filesystem might have a real advantage in terms of keeping the database records in a consistent state. Remember that in Unix files have arbitrary data. So a journaling filesystem tends to keep the meta-data in a consistent state. It doesn't do much about the application data written into the files. However, if mySQLfs had knowledge of the records being written, presumably it could do a lot of the cool stuff done in mainframe OSes to ensure integrity. I don't know the VFS interfaces though, so I'm not sure if this is implementable under the current linux framework.
  • I do regret that the post this is attached to happened.

    This is what happens when you discover that your co-worker has been posting crap as cyb0rq_m0nk3y, and then they feel that it would be funny to post their inane rant on my computer while in the restroom.

    Makes us (the tewwetruggur contingency) look by far dumber than normal.

    again, my apologies.

  • by PureFiction ( 10256 ) on Tuesday January 16, 2001 @11:01AM (#503860)
    Except there is no SQL interface to reiserFS ;)
  • I've never been thrilled with the performance of storing LOBs in any kind of DB -- Oracle, PostgreSQL, or MySQL. The plain-old filesystem tends to do it better and faster. I usually store the path to an object in the DB instead.

    That being said, I have used the LOB storage in Postgres to implement a versioning system for in-house work (and it worked well enough to prove to me that it's do-able, but not well enough to actually use). The concept is sound, but the implementation needs some work.

    However, using a DB's LOB is a helluva lot better than using CVS for binary objects. CVS seems afflicted with unseemly memory bloat when checking in/out large binary objects...

  • Would this MySQL-based file system be more like BeOS's file system, where files can have arbitrary attributes at the FS level, and you can query based on them?
  • I suppose that file-associated metadata, or simply a file containing metadata, could take care of typing.

    Thanks

    Bruce

  • Well, he's a bit confused but not too far off. The kernel would have been a heck of a lot less interesting if there hadn't already been the GNU system to run on top of it.

    Thanks

    Bruce

  • by Trinition ( 114758 ) on Tuesday January 16, 2001 @10:04AM (#503870) Homepage
    No offense to MySQL, but is it ready for such a task? Last I heard, MySQL didn't have record-level locking except in some experimental forks. Are there any features lacking from MySQL that might make another database more appropriate (ignoring for the moment their licenses)?
  • "Please do not feed the trolls."

    Hell, I will anyway. WTF are you talking about??? GTK *flickers*? Since when? I used to have GTK apps on an old 40MHz 486 (and it was a DLC machine at that...a Cyrix 486 that plugged into a 386 mobo) and I didn't see said flickers *unless the app was poorly written, was doing animation, and didn't double-buffer.*

    Bah, Troll Tech wrote somewhere? The PR department wrote an official release that said something along the lines of "we did not emulate the slow and flickery refresh of GTK(or was it gnome?)" Bullshit. Show me the link. Why would Troll Tech have a position on GNOME, anyway? They don't compete with GNOME in any way. They write a toolkit. I've seen some poorly-written QT programs that display flicker like all hell, and I've seen some GTK apps with decent animation. I've also seen well-written QT apps that display no flickering, and bad GTK apps that do. It depends on the app, I suppose.
  • Think of the file system as a database, yes, but think of the files not as records but as objects in the sense of object-oriented programming. A file is an instance of a particular class of object [*]. Associated with that class are various methods (e.g. associated with an HTML file are methods to view this with netscape, or edit it in my favorite raw text editor, or edit it with my favorite HTML composer, etc.; associated to an executable file is the execution of it.). There is a hierarchy of classes with inheritance (the class of HTML files is a subclass of PLAINTEXT files).

    [*] really the class has other data structures besides the actual file data: e.g. file name, a field for comments about the file, etc., which may vary from class to class

    There are also a variety of classes which serve as containers. The most obvious are what traditionally are directories or desktops. Another container class is "query", which has typical database search methods associated. These can be saved, copied, etc.

    Imagine this: your command line should not be associated with a particular directory location, but rather a particular query. On the command line you most frequently use "cq" ("change query"), "rq" ("restrict query"), and "eq" ("expand query"). So to view the penguin image I know lurks somewhere on my drive, the sequence would be something like

    % cq type=image
    5037 files selected
    % rq *penguin*
    2 files selected
    % ls
    penguin_57.jpg
    penguin.gif
    % ./penguin.gif
    No default action for type "gif"; performing default action for type "image": opening penguin.gif with gimp...

    (And, of course, there are obvious database sorts of features that any sensible graphical file explorer should have...)
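
    A toy version of that query-based shell fits in a page of Python. Everything here is an assumption for illustration: the class name, the file records, and in particular the meaning of "eq", which this sketch takes to be "drop the most recent restriction".

```python
import fnmatch

class QueryShell:
    """Toy shell whose current location is a stack of filters, not a directory."""

    def __init__(self, files):
        self.files = files        # list of dicts with at least "name" and "type"
        self.predicates = []

    def cq(self, **attrs):
        """Change query: start over with a new attribute filter."""
        self.predicates = [
            lambda f: all(f.get(k) == v for k, v in attrs.items())]
        return self.ls()

    def rq(self, pattern):
        """Restrict query: additionally require the name to match a glob."""
        self.predicates.append(
            lambda f: fnmatch.fnmatchcase(f["name"], pattern))
        return self.ls()

    def eq(self):
        """Expand query: drop the most recent restriction (assumed meaning)."""
        if self.predicates:
            self.predicates.pop()
        return self.ls()

    def ls(self):
        """List the names of all files selected by the current query."""
        return [f["name"] for f in self.files
                if all(test(f) for test in self.predicates)]

FILES = [
    {"name": "penguin_57.jpg", "type": "image"},
    {"name": "penguin.gif", "type": "image"},
    {"name": "sunset.png", "type": "image"},
    {"name": "penguin_notes.txt", "type": "text"},
]
```

    With these records, QueryShell(FILES).cq(type="image") selects three files, a following rq("*penguin*") narrows that to the two penguin images, and eq() widens it back to three.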

    To summarize:
    (1) YES!!! Regardless of how exactly the system implements it, the filesystem should be interfaced as a database.
    (2) Furthermore, don't view files just as RECORDS -- view them as active OBJECTS that are instances within a hierarchical class structure.

    Finally, I think a lot of this can be done just with user interface, without having it explicitly in the filesystem. In fact, things have definitely been moving this direction, at least for graphical file explorers. Has anyone added this sort of thing to a command shell?

  • The fact that MySQL doesn't have all those funky "relational features" like foreign keys, triggers, rules, and stored procedures means that this sort of "view" is just about perfect.

    All those complications that MySQL eschews are the sorts of things that would muss up the idea of viewing "database as FS hierarchy."

    And as for the "locking" and "transactional" issues, the point is not terribly different. Filesystems generally don't provide ACID properties; neither does MySQL; that fits together well.

    Mind you, it's quite possible that there's a much bigger controversy concerning stability; based on the MySQLFS web page, it appears that they're passing a CORBA IOR into the kernel. What can that possibly mean other than that they're assuming the presence of the "kORBit" implementation in the kernel? The flaming that surrounded "Why don't we try putting an ORB in the Linux kernel?" was much more vigorous than any flaming about MySQL lacking some ACID features! :-).

  • First off, if you want something with serious DB features but without using SQL, you'd do well to just write a wrapper which adds/looks up entries in an SQL database but can be accessed without SQL. I don't know of anything like this existing right now simply because people who want serious database features (or who are writing a serious database) use SQL.

    Well, almost.

    You can also use ReiserFS -- particularly in a little while, after it implements lightweight files (thus reducing the amount of overhead for each record). Yup, ReiserFS has low-level support for relational storage, and lots of Other Cool Stuff. I understand that Squid has accelerated support for it; I've also seen a system for indexing newsgroup articles that uses Reiserfs as its backend. Roughly put, this is possible because of reiserfs's blazing speed when working with small files; it also has a plugin API (in-progress?) and Assorted Other Good Stuff.
  • the Be OS has had a database-like, journaling filesystem since its very first release. it's the best of both a database, and a file system. I don't know what i would do without it. it makes sorting my thousands upon thousands of mp3s a snap. Add a CD to the collection, fill in the attributes for genre, album, year of release, and so on, and I have a fully searchable collection.
  • by Shadowlion ( 18254 ) on Tuesday January 16, 2001 @10:39AM (#503884) Homepage
    No. I believe BeOS just has a meta attribute at the FS level filled with supposed attributes.

    BeOS doesn't have one big clump of data that is partitioned; it has a lot of little bits of data. There is nothing "supposed" about the attributes.

    For instance, an "MP3" datatype might have fields for Artist, Title, Album, Year, and Comments. You could then search for any song with the word "Land," performed by a group with "Men" in their name, on any album between the years 1982-1988, with the phrase "sounds like crap" in the comment field.
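
    That search is easy to picture as a filter over per-file attributes. A sketch in Python, not the BeOS query API: the query helper and the song records below are made up for the example.

```python
def query(files, **conditions):
    """Return names of files for which every attribute condition holds.

    Each condition maps an attribute name to a test function; a file that
    lacks the attribute never matches.
    """
    return [f["name"] for f in files
            if all(key in f and test(f[key])
                   for key, test in conditions.items())]

# Placeholder records; none of these describe real recordings.
songs = [
    {"name": "track01.mp3", "Artist": "Example Men", "Title": "Some Land",
     "Album": "First", "Year": 1984, "Comments": "sounds like crap, sadly"},
    {"name": "track02.mp3", "Artist": "Example Men", "Title": "Some Land",
     "Album": "Reissue", "Year": 1995, "Comments": "sounds like crap"},
    {"name": "track03.mp3", "Artist": "Other Band", "Title": "Landing",
     "Album": "Second", "Year": 1986, "Comments": "great"},
]

hits = query(
    songs,
    Title=lambda title: "Land" in title,
    Artist=lambda artist: "Men" in artist,
    Year=lambda year: 1982 <= year <= 1988,
    Comments=lambda text: "sounds like crap" in text,
)
```

    Only track01.mp3 passes all four tests; the others fail on the year or on the artist and comment.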

    There's no way to actually define a new field at the FS level.

    Sure there is. Preferences->Filetypes allows you to add new attributes to a particular filetype, as well as define new filetypes. It's up to the associated applications to do anything meaningful with the new field or fields, but you can pretty much do what you want.

    On the command line, I don't think you can manipulate a global filetype. However, for individual files, you can add your own attributes, delete existing ones, and so on.

    Early version of BeOS did use a database FS for the entire system but they dropped it by R4 (I think that's the right version) because of performance issues.

    The early versions of BeOS used a separate database (not very complex) and filesystem, which wound up being very difficult to work with, so eventually they merged the two. The "database" aspects of the BeOS filesystem are more of being able to add (relatively) arbitrary data to particular filetypes, and do searching based on those criteria. It isn't a formal database in any sense of the word.


    --

  • Well, but maybe you should.

    Sure - a DB accessed as a filesystem doesn't present the full power of the DB through the filesystem API. And sure, a DB filesystem doesn't necessarily have the same performance characteristics as a standard filesystem.

    But there are some very significant applications where a DB presented as a filesystem makes brilliant sense. Here's two simple ones off the top of my head.

    Configuration management. Systems like CVS go to great trouble to get transactional behavior, so that you can't lose code if the program crashes in the middle of an update. If you're using a DBFS, you've got transactionality and rollback for free.

    Micro-applications. There are a lot of simple applications which really need transactionality/rollback facilities, but which can't (either for portability or for size reasons) make use of a complete transactional database facility. Write it to access files, and let the database take care of transactions.

    I don't have anything to do with this project, but I think it's a great idea, because I'm doing almost the same thing with DB2. (Why DB2? Because I work for IBM Research..) I'm building an SCM system, and I don't want the higher layers of my system to need to understand the database or the particular table layout that I'm using. So they access it as a filesystem; downbelow, it's a rock-solid database.

    Of course, all of the above assumes transactionality - which is not yet fully supported by MySQL. So I'd be a little paranoid before using this, to make certain that they're using the transactional tables!

    -Mark
  • Many people like to store binaries in the mysql databases, such as images. This would really help improve their ability to code this.

    As PHP is used in conjunction with MySQL a lot, the functions like move_uploaded_file could be used to store blobs in the database rather than an insert into a blob field making your code much easier to read, but, of course, making the server setup a lot more complicated.

    Without row level locking, however, you will face bottlenecks if you try to do anything besides a mostly read-only file system.
  • Let me go OT for just a second here: does anybody out there know of any open-source systems out there that can do large-scale data storage *without* SQL? I am thinking of a simple C/C++ API that you can use to retrieve and write data from/to tables/fields, nothing much fancier than that. So far, my best bet seems to be ColdStore [sourceforge.net]. Any other pointers?
  • I have long wanted a database to be integral to the OS. However the way I have thought of it would be instead of using the filesystem, you would have a table. The record names would be something like "FileName", "Owner", "LastModified", "Text", etc.

    This would be great for text based files and spreadsheets. The possibilities for searching and updating your files would be greatly enhanced by having them maintained by a database.

    I don't think a database would be appropriate for graphics or music files (other than storing pointers to those files), but certainly any text-based file would be ideal.

    Given my thoughts on how a database enabled filesystem would work, I don't think very many joins or triggers would be necessary. Most things could be handled by single tables.

    Besides, there is the matter that mySQL doesn't support foreign keys or triggers anyway, and last I checked those features weren't on the to do list. :)

  • Not just that, but for a robust DB, the commit and rollback operations are atomic. It either happens or it doesn't. No half-way measures. So if your disk crashes during a commit operation, you are guaranteed that either the operation went through or not. No mangling of the data.
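
    A small demonstration of that all-or-nothing behaviour, sketched in Python with sqlite3. It simulates a failed statement rather than a disk crash, and the accounts table and transfer helper are hypothetical, chosen only because a two-row update makes a half-finished state easy to see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount between rows; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
    except sqlite3.IntegrityError:
        return False
    return True

def balances():
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

    Calling transfer("a", "b", 500) credits b first and then fails the balance check on a, yet afterwards b is back at 0: the credit that had already run was undone along with everything else in the transaction.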
  • Yes, you don't have to tell me that :-) .
  • SQL is a language of statements used for database interaction.

    SELECT Age, Height, Name FROM MyAddressBook WHERE Age<18

    The above statement will return the fields Age, Height, and Name from the table "MyAddressBook" and limit the values to only those whose age is less than 18.
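
    If you want to try the statement without installing a database server, here is a small sketch using Python's built-in sqlite3 module. The sample rows are made up, and the ORDER BY clause is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyAddressBook (Name TEXT, Age INTEGER, Height INTEGER)")
conn.executemany(
    "INSERT INTO MyAddressBook VALUES (?, ?, ?)",
    [("Alice", 17, 160), ("Bob", 42, 180), ("Carol", 12, 140)],
)

# Same statement as above, plus ORDER BY for a deterministic result.
rows = conn.execute(
    "SELECT Age, Height, Name FROM MyAddressBook WHERE Age < 18 ORDER BY Age"
).fetchall()
```

    Each result row is a tuple in the order the fields were requested, so rows holds the Age, Height and Name of Carol and then Alice; Bob is filtered out by the WHERE clause.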

    There are also ways to use SQL to insert data, create tables, and lots and lots of other stuff.
  • "Structured Query Language". A standard way of getting data into and out of DBs.

    Example: "select answer from whizbang where
    qid = 22 and sid = 1"

    whizbang is a table (imagine a spreadsheet that's accessed row-by-row). The query grabs the answer column from the whizbang table, but only in rows where the value in the qid column is 22, and the value in the sid column is 1.

  • There used to be a PostgreSQL filesystem (see this Linux Journal article [linuxjournal.com]).

    Can someone compare this to the MySQL filesystem, or perhaps point me to a place where pgfs can still be downloaded?

  • While this doesn't really seem very useful to me (SQL is after all good at what it does..), it seems silly to make it for just one database. It's easier to use common APIs (ODBC?), or at least something custom-made but generic, and try to keep the SQL generic too (nothing fancy is needed for this sort of thing anyway) from the start. It's soo much harder to change after the fact. (Not that they said anything about this, but I assume that means it's as MySQL-specific as it can be..)
  • If a filesystem is a database is a filesystem and getting BLOBs out of a database is so hard, why don't you just store the pathname in the database and the image in the filesystem? I'm having a hard time envisioning storing images in a database as being "often necessary". But what do I know? :-)
  • Open your mind a little further. You're only envisioning the first steps. Certainly being able to traverse the database like a filesystem is an extremely powerful interface. Data integrity could be preserved by interacting with the shell the user is in while (s)he is trying to manipulate data. For example:

    goober:$ cd USER_ID

    goober:$ ls
    11023 11025 11044 11055 11092
    goober:$ rm 11023
    Removing USER_ID 11023 will also remove the data for this user in the FIRST_NAME, LAST_NAME, TIMESTAMP, and AGE fields. Do you really want to do this? (Y/n) Yes
    goober:$

    Get the picture? ;-) Essentially, the interface will have to understand how to translate a row into a hierarchical representation. Getting a useful and easy-to-understand interface this way may not be the easiest thing to do. Add the difficulty of resolving foreign key dependencies and such a CLI would get quite confusing. An interesting spin on this would be to bind canned queries to directory names. Still, it has a very limited scope. Complex queries would be difficult to attain through such a CLI.

  • XML.

    Seriously. The average config file is a hierarchical structure as opposed to a table structure. This makes it ideal to use XML.

    I think that making config files usable by XML, one could write custom configuration apps that would work with any program. Example: mythical "gappconf" would simply need the rc file as well as some DTD description of the tags for that program, and it would present the standard tree widget with descriptions of what options you can change, what restrictions you have, etc.

    Now, extending your idea above, I would think it's also possible, but beyond my ability, to write a fs that fakes a directory structure based on an XML file, so that you can descend into directories via your favorite shell and change specific options that you want.
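
    The mapping itself, leaving out the kernel side entirely, might look something like this in Python. The config document and the as_tree and listdir helpers are invented for the example, and the sketch ignores XML attributes and repeated sibling tags, which a real fs would have to handle.

```python
import xml.etree.ElementTree as ET

# Made-up config file; tags become directories, leaf elements become files.
CONFIG = """
<config>
  <network>
    <hostname>goober</hostname>
    <ppp><mtu>1500</mtu></ppp>
  </network>
  <editor>vi</editor>
</config>
"""

def as_tree(xml_text):
    """Map each leaf element to a path -> text entry, like files in directories."""
    entries = {}

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        children = list(element)
        if not children:
            entries[path] = (element.text or "").strip()
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return entries

def listdir(entries, directory):
    """List the immediate children of a faked directory."""
    prefix = directory.rstrip("/") + "/"
    return sorted({path[len(prefix):].split("/")[0]
                   for path in entries if path.startswith(prefix)})

tree = as_tree(CONFIG)
```

    Here tree["/config/network/ppp/mtu"] is "1500", and listdir(tree, "/config/network") gives the two entries hostname and ppp, which is about what ls would show after a cd.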

  • You can give just about anything a filesystem interface, it's just a matter of how good the implementation is and how useful it is.

    There have already been even FTP and HTTP filesystems for several operating systems if memory serves, and I know there have been a couple other odd ones for BeOS. I nearly did a database FS for Be a few years ago myself.

    Speaking of which, this would be much easier to implement (filesystems are simple to write) and more useful IMHO (because there are already standard APIs to query filesystems and support any number of attributes for files at the OS-Filesystem level) to do for BeOS.

    I'm sure it can be done for linux too, but I have doubts as to the usefulness of it under any OS, much less one where you don't have the luxury of being able to utilize existing attribute support.

    It might give some shortcuts for reading, but writing will likely be very complicated. I don't see a good way to do anything along the lines of joins either. The idea of using "." files/directories will help provide some of that I suppose. Permissions will also be a problem, though I guess you could just go by login name.

    A good reason to have filesystem interfaces to complex resources (like FTP, HTTP, databases, etc) is that it is easy to access things on a filesystem from within just about every programming language on every platform. However, by forcing the normal interfaces to those resources down into what can be done to a filesystem some things also become very complicated. To do those more complicated things will either mean complicated interfaces or programs that give the filesystem information through some other means (ioctl?) or perhaps writing commands to a file within that filesystem ("echo 'SELECT blah FROM blahtable' > /mnt/mysql/queries/testquery" and then looking in /mnt/mysql/queries/testquery/ for the result set, for example).

    In short, I'm sure it'll be very fun to implement and be an interesting toy which may even have some uses...
  • by mikehoskins ( 177074 ) on Tuesday January 16, 2001 @11:22AM (#503924)
    Did you ever hear about BLOBs? Imagine being able to load and unload BLOBs (Binary Large OBjects) in a database in an easier fashion, or, at least, in two different ways.

    Imagine a dynamic web site that uses this! You could simply copy files (especially graphics files) to/from a table easily and look them up via SQL queries! My goodness, the usefulness is extreme, people.

    Have any of you (fs!=db) nay-sayers ever tried to store/retrieve GIFs and JPEGs in a relational database for a web site -- an often daunting, but often necessary task? There are whole articles on how to store/retrieve pics as BLOBs via MySQL/PHP on PHPBuilder.com: http://www.phpbuilder.com/columns/florian19991014.php3 and (sorta) http://www.phpbuilder.com/columns/bealers20000904.php3

    So, for those of you who can't get over this idea, try doing sites that store images in databases sometime. An idea like this (one being done by the big RDBMS vendors -- and I work for one of those) is a BOON for websites. It also has many other applications.

    A layer of abstraction is often a good thing for filesystems, and it's where things are headed. IMHO, db's could provide BETTER security and make things more distributed than current filesystems do. Imagine whole new networked filesystems that are distributed databases. Open your mind. Think about it hard before brushing it aside.

    Besides, a db is an fs is a db. It depends on how you look at it, your definition, etc. Is a filesystem relational? Does a db use local storage, often RAW storage? The true computer definitions of the two are not all that different. And SQL is not the only query language out there. Haven't you heard of the CLI, which uses commands like cat, ls, echo, rm, mv to handle data? What about those relationships called directories?

    I say, what's the real issue? Raw speed? Oh, wah! Grow up and join the enterprise! Oops, I guess the AS/400 must not be a viable platform; they've been doing this HOW LONG?!?!?

    Q: When are we Linux/Open Source people going to get enterprise-level file and storage management?

    A: When we get to the point that we implement at least a JFS (if not a full-fledged logged filesystem), good logical volume management, real uninterruptible power, truly fault-tolerant hardware/software clustering, better security, and fully distributed storage management that backs up and versions data automagically.

    On a lighter note, MySQL now implements a filesystem. :-)

  • and POW! You've got the religion!

    Currently, programmers are about as far removed from the computers they work on as a chimpanzee is from a spider monkey. An end user is an armadillo.

    People DON'T think in terms of files and folders -- that's a holdover from filing cabinets, where it is the only method to wrangle and organize physical documents. People think in terms of tasks and projects.

    If you're working on a project and you remember something you did 2 years ago that you can re-use (or at least build from), you don't say, "It's in that tape we backed up to in the 'Projects for Harold' folder". You say "Didn't we do something similar for Harold a few years ago? Something to do with sand and apricots, I think"

    You pull up Harold's info, look through the list of projects done for him, and find one described as "Peaches and Beaches" -- bingo, you've found it.

  • Would this MySQL-based file system be more like BeOS's file system, where files can have arbitrary attributes at the FS level, and you can query based on them?

    No, the MySQL filesystem will only provide the means to access regular databases as if they were filesystems.

    I imagine that in the root of the mount point there will be a directory tables/ with a list of tables in it. This is not like the BeOS's filesystem that stores files in a table.
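
    As a guess at what that root listing might look like, here is a small sketch that turns a database's table names into tables/ entries. It assumes sqlite3 in place of MySQL (hence the sqlite_master lookup), and list_mount_root is a made-up helper, not part of the project.

```python
import sqlite3


def list_mount_root(conn):
    """Return the directory entries a hypothetical mount root might show:
    one tables/<name> entry per table in the database."""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return ["tables/" + row[0] for row in cursor]
```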

    Nonetheless, this is pretty cool. If a stable version of this thing is released (which is probably not in the near future), I imagine using this to make backups to a CVS server (like the article suggested).

"Everything should be made as simple as possible, but not simpler." -- Albert Einstein
