A Fast Start For openMosix

axehind writes "Dr. Moshe Bar recently announced the creation of openMosix, a new OpenSource project. The project has quickly attracted a team of volunteer developers from around the globe and is off to a very fast start. openMosix is an extension of the Linux kernel. openMosix is a Linux kernel extension for single-system image clustering. openMosix is perfectly scalable and adaptive. Once you have installed openMosix, the nodes in the cluster start talking to one another and the cluster adapts itself to the workload."
This discussion has been archived. No new comments can be posted.

  • by cscx ( 541332 ) on Monday April 15, 2002 @11:48AM (#3343535) Homepage
    Oh wait...
    • It's an old joke, but it has its place here. Does anyone have pointers to info comparing MOSIX and Beowulf? Are there any fundamental differences, or are they different implementations of the same idea?
      • The main difference is that Mosix doesn't work with threads. You can spawn a separate process on a node and it can migrate to different nodes. But if your application is threaded, all the threads will run on one node, or migrate between nodes together.
        • actually, I'm curious. What is the practical upshot, since it's different from Beowulf?

          (also: so how's this differ from Appleseed?)
          • The biggest difference between a MOSIX cluster and a Beowulf cluster is what they use to do their distributed computing.

            Beowulf usually uses PVM, which requires you to do some explicit work with the PVM library. With PVM you also get better control of some things (how tasks are distributed to which machines is one). The other thing is that PVM is entirely userspace, and can use a cluster of computers that has, say, x86 Linux boxes and SPARCs in it.

            Mosix is almost entirely kernel space. When you use Mosix you fork off a process and then communicate using Unix domain sockets. You can't use threads because there isn't shared memory support, but you don't have to make any calls to a Mosix library to make your program work. The downside is that you can't really control where your process goes, and the cluster has to be all the same arch (because the entire process migrates rather than the correct executable being called on the remote machine).
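The fork-and-sockets model described above can be sketched in a few lines of Python. This is only an illustration of the programming style, assuming a Unix-like system; the function name `sum_squares_forked` and the workload are made up for the example, and nothing in it is specific to Mosix or openMosix. The point is that the program uses plain `fork` and a Unix socket pair, with no cluster library calls anywhere.

```python
import os
import socket
import struct


def sum_squares_forked(chunks):
    """Fork one child per chunk; each child sends its partial sum back over a socket."""
    channels = []
    for chunk in chunks:
        # socketpair() gives two connected Unix-domain sockets on Linux.
        parent_end, child_end = socket.socketpair()
        pid = os.fork()
        if pid == 0:
            # Child: an ordinary process doing ordinary work.
            parent_end.close()
            partial = sum(x * x for x in chunk)
            child_end.sendall(struct.pack("!q", partial))
            child_end.close()
            os._exit(0)
        # Parent: keep its end of the socket and remember the child.
        child_end.close()
        channels.append((pid, parent_end))

    total = 0
    for pid, sock in channels:
        data = b""
        while len(data) < 8:  # a packed signed 64-bit integer is 8 bytes
            piece = sock.recv(8 - len(data))
            if not piece:
                break
            data += piece
        sock.close()
        os.waitpid(pid, 0)
        total += struct.unpack("!q", data)[0]
    return total
```

Calling `sum_squares_forked([range(0, 1000), range(1000, 2000)])` forks two children and adds up their results. On a single machine both children simply run locally; whether a kernel-level system would move them elsewhere is outside the program's control, which is the trade-off the comment describes.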
  • by jellomizer ( 103300 ) on Monday April 15, 2002 @11:50AM (#3343552)
    Almost 5% of the text is the word openMosix. Is it more or did that post sound like an advertisment to recrute Open Source Developers. Now I am going to read the article.
    • This will probably be seen as a troll, but I quite honestly get the idea that Moshe Bar likes to see his name in print. Everything I've read (that's not everything he's written, just everything I've read) written by Dr. Bar seems overly self-congratulatory and spends too much time in self-promotion as opposed to donating clue to the reader. He has some thoughts on journaling filesystems [linux-mag.com] that are interesting and don't seem to suffer from this problem (as much?).

      I don't get the same impressions from Daniel Robbins of Gentoo [gentoo.org], who wrote Advanced filesystem implementor's guide [ibm.com] for IBM's developerWorks.

    • Is it more or did that post sound like an advertisment to recrute Open Source Developers.

      I think it was more. But who cruted these Open Source Developers in the first place?

  • Hype? (Score:5, Insightful)

    by Raskolnk ( 26414 ) on Monday April 15, 2002 @11:55AM (#3343583)
    openMosix is perfectly scalable and adaptive

    Nothing like a 'perfectly' statement to discredit a story.
    • Re:Hype? (Score:2, Insightful)

      Oooh, you cynic.

      Maybe interpret it as "as perfectly as can be done with existing (read common, cheap) technology" instead.

      Sure, we know Amdahl's law is pretty much like the laws of thermodynamics (the best you can do is break even, and you can't even break even).
      However, unless you are talking about high-budget professional solutions (e.g. Cray, HP Superdomes, most big shit from Sun, other highly integrated solutions with custom inter-processor/memory communications), you're always going to take this hit, and openMosix has no reason to be worse than other simple solutions. And if it can reach a state where there seems to be no performance improvement without throwing hardware at it, then surely it could be said to have reached perfection? However, that's a lot of "if"s, and is all pie in the sky at the moment; whether it achieves this 'perfection' target remains to be seen.

      YAWAIW.
    • "perfectly scalable" == "scales linearly"
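For readers who want numbers behind the Amdahl's law remark and the "scales linearly" reading, here is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of nodes. The function name `amdahl_speedup` is just a label for this example, and the formula ignores communication and migration overhead entirely.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup on `nodes` machines when only part of the job parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)
```

Only a job with `parallel_fraction` equal to 1.0 scales linearly (8 nodes give a speedup of 8). With 95% of the work parallel, 16 nodes give a speedup of a little over 9, and no number of nodes can push it past 20.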
  • by Brento ( 26177 ) <brento.brentozar@com> on Monday April 15, 2002 @11:57AM (#3343593) Homepage
    There's already a mini-howto [sourceforge.net] explaining how to set this up in combination with a Linux Terminal Server. Basically, you end up with a bunch of workstations that actually relieve the server from CPU load. Odd to think that the more diskless workstations you add to your network, the faster it becomes!
  • by nakhla ( 68363 ) on Monday April 15, 2002 @12:01PM (#3343617) Homepage
    This seems like a great technology for an enterprise to take advantage of older hardware. Upgrading your company's desktop PCs? Take the older ones and plug them into your openMosix cluster. If I recall correctly, processes can automatically migrate from node to node based on system load. I know my old school had a Unix cluster for all of the CS students to use. It would get seriously bogged down at times, especially around finals. It'd be nice to have something like this which is able to take advantage of older hardware. There were times when a simple 'ls' would take 30 seconds to complete. Certainly this is something that an old 486 node could take care of. [umbc.edu]
    • Not if the reason it was blocking was due to filesystem latency. If everyone on the network does 'ls' simultaneously in separate directories (on network storage) then the population of lag-city will increase dramatically. YAWIAR.
      • Actually, if everybody did a "ls" on one node's file tree, the latency would be a once-only filesystem access followed by whatever network latency there is as the machine sends out the answers. It won't have to touch disk again because Linux will have loaded the page into its cache, we presume.

        Though yes, there certainly are cases where this will be unable to handle the load (as with any computer), but the load level an openMosix cluster can handle will be, for almost all uses, it would seem, much larger than what any single node can handle.

        -Knots
        • That's why I specified different directories. The caching is fairly fine-grained, and I am expecting the server to have to hop around the disk quite a bit in the process. I've seen as few as 100 users doing nothing but menial things (the machine's only used for reading mail and running telnet mainly, very low CPU requirement) slow an HP-UX system to a halt.

          However, you're right, second time they do it, it should be instant from the cache.

          YAWIAR.
  • by Anonymous Coward
    Migrating live processes between boxes with open sockets is the last big obstacle to OpenMOSIX being used in large web farms, as far as I know. It seems that OpenMOSIX is geared more to scientific computation problems with IO only at the beginning or end of the batch job. If a MOSIX process has a lot of I/O, it stays on the same box and is never migrated.
  • The bit I am curious about is - if Mosix were GPL, and presumably contributed to by various of the now OpenMosix researchers, how did it become closed?
    I have found This link [debian.org], but that suggests a code fork rather than a revival-after-closing-source.
    • Here is some background:

      http://foundries.sourceforge.net/clusters/index.pl?node_id=41457&page=1&lastnode_id=41457
      • Well that link is /. so maybe you could explain some here. I was also real curious how this happened.
        • Look at what happened with SSH if you want a similar example. If you are the original author (so the copyright is yours) of a GPL project then you are perfectly entitled to produce a closed source version at some point in the future. It's other people who get the code via the GPL licence who are forced to keep it free.
          • Does this include taking modifications by others produced under the GPL, or only the original source at the time it was licensed? TIA
            • No, the original author can't create closed source from other people's GPL'd contributions without asking their permission. This situation is covered in the GPL faq [gnu.org].
              • Thank you. Whew! I thought I was missing something about the GPL. I have read that page before but not that specific link. Considering all the items on that page, I do not feel too bad about not knowing all the ins and outs of the GPL.

              • One effect that *does* occur to me - while the actual source of the changes made by Y to X's code are GPLed, any IP in the changes is also owned by Y (and licenced under GPL). As a side effect, presumably not only would X have to reproduce the effect of Y's patches if the new closed source release is to have the same abilities, he would have to show that he either came up with the ideas incorporated in Y's patches independently, or that they would have been obvious to anyone approaching the problem (the usual IP stuff).

                obviously, IANAL, so if anyone wants to take a stab at answering this one..

    • Re:Mosix (Score:2, Insightful)

      by Scarpux ( 556596 )
      Yes, it is a fork. If you look at the project page [sourceforge.net], you'll see that openMosix split from Mosix because Dr Barak wanted to move from the GPL to a proprietary license. Moshe Bar, who worked with Barak on Mosix, took the GPL code and created the openMosix project. I read an interview a while back with Moshe Bar, but I can't seem to locate it.
      • There's an interview with Moshe in Sourceforge's Clustering foundry where he talks about the fork. Though SF seems to be having trouble right now. http://foundries.sourceforge.net/clusters/
  • OpenMosix is my best friend. I run several machines at home and I can tell you that openMosix rocks my world. I have a (gasp) Windows machine on my KVM switch that I use for playing games. When I am not using it, I run VMWare with a small Linux install and openMosix to take advantage of that machine's processor power. No point in letting it sit idle when I am working on my Linux machines.
    • What do you use it for? I was looking for some kind of way to migrate threads in programs I write to several computers, but from what I read it didn't seem like a good idea to run a threaded Java app because it wouldn't migrate anyway. I think it was because it used shared memory or something like that. I hope I understood it wrong, but I stopped looking into it when it seemed that there is no real reason for me to run any kind of cluster unless I want to write "real" parallel code, which I don't need for the stuff I code, yet at least.
  • by Gerdts ( 125105 ) on Monday April 15, 2002 @12:34PM (#3343833)

    Under some workloads, I can go along with the assertion that a MOSIX cluster is just like having a big machine with a lot of CPU's. It seems to be great for those workloads and I would love to try it out. Those loads tend to be multiple long running (more than a few seconds) and not multithreaded. For MOSIX to be most efficient, there also needs to be fewer jobs than there are CPUs to run them.

    Other workloads, however, will not benefit from MOSIX. These statements are based on reading the docs a couple weeks back, not on actual experience.

    Under the MOSIX model, when a process forks, the child may run on the current machine or it may migrate somewhere else. If the job is short lived (ls, echo whatever | sed s/blah/baz, you get the point) MOSIX will perform poorly because it will spend more time trying to figure out where the process should run than would have if it had just run the program on the local host.

    If you need more CPU time than one CPU can provide and your program is multi-threaded, a single multiprocessor machine will also work better. This is because MOSIX does not yet support threads running on different machines. A 128-node cluster of 386's is going to run Netscape slower than a single 486 because you will only be using one 386 CPU.

    For cases where you just have too many jobs for the resources available (CPU or memory), you may be better off with something like Condor [wisc.edu]. It is great for submitting batch jobs, migrating those jobs around, and only running the number of jobs that the system can handle.

    • Under some workloads, I can go along with the assertion that a MOSIX cluster is just like having a big machine with a lot of CPU's. It seems to be great for those workloads and I would love to try it out. Those loads tend to be multiple long running (more than a few seconds) and not multithreaded. For MOSIX to be most efficient, there also needs to be fewer jobs than there are CPUs to run them.

      Other workloads, however, will not benefit from MOSIX. These statements are based on reading the docs a couple weeks back, not on actual experience.

      Speaking from experience, you are pretty much correct. Jobs that use lots of CPU, but have little IO are good for mosix clusters, but jobs that have high IO are bad. The mosix filesystem and other things can partly get around the IO problems if the users plan carefully, but mostly they just want to start 30 jobs and forget about it for a few days.

      There is no reason that a mosix cluster can't be combined with a batch/queueing system. This lets lazy/stupid users run their CPU bound jobs and lets mosix distribute them, but more savvy users can script their IO jobs to run on particular machines and use local disk for IO.

      It took a few months for the users of the cluster I set up to get trained into what jobs work well, and which kill the cluster. The problem is that launching 40 "good" jobs on a single machine is not a problem, because they just shoot out to the other nodes, but launching 40 "bad" jobs on a single machine will make that machine almost unusable.

      This can have adverse effects on the cluster if the good jobs were started from the overloaded machine; for example the good jobs might have to check back with their originating machine every few minutes to update a checkpoint file.

      Basically, mosix isn't some magic bullet to solve machine limitations, but it is a very cheap and effective way to solve certain problems.
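One rough way to tell a "good" (CPU-bound) job from a "bad" (I/O-bound or waiting) one before launching dozens of copies is to compare CPU time with wall-clock time for a trial run. The sketch below uses only the Python standard library; `cpu_fraction`, `busy_job` and `waiting_job` are names invented for this example, and the sleep call is merely a stand-in for time spent blocked on I/O.

```python
import time


def cpu_fraction(job):
    """Run `job` once and return CPU time divided by wall-clock time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    job()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def busy_job():
    # Pure computation: the kind of work that benefits from more CPUs.
    sum(i * i for i in range(500_000))


def waiting_job():
    # Stand-in for a job that mostly waits on disk or network.
    time.sleep(0.2)
```

A fraction close to 1 suggests the job spends its time computing; a fraction close to 0 suggests it spends its time waiting, so adding CPUs is unlikely to help. Any cut-off between the two is a judgment call for the particular cluster.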

    • > Under the MOSIX model, when a process forks
      > the child may run on the current machine or
      > it may migrate somewhere else. If the job is
      > short lived (ls, echo whatever | sed
      > s/blah/baz, you get the point) MOSIX will
      > perform poorly because it will
      > spend more time trying to figure out where
      > the process should run than would have if it
      > had just run the program on the local host.

      No. openMOSIX keeps statistics on what processes do and uses them to decide whether migrating them will be useful, so short-lived jobs will never be considered for migration.

      As a matter of fact, the greedy algorithm that openMOSIX uses (developed by Prof. Amnon Barak and his team from the Hebrew University) will pretty much avoid migration at all costs unless it is absolutely positive that migration will be useful. Most of the time the problem is that processes you want to migrate don't migrate, not the other way around.

      It is of course true that openMOSIX has some limitations. For example, processes using shared memory can't migrate at the moment. You just can't win them all ;-)
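To make the "conservative by default" idea concrete, here is a toy decision function in Python. It is not the openMOSIX algorithm and is not taken from its source; every name and threshold (`ProcessStats`, `should_migrate`, `min_cpu_seconds`, `max_io_rate`, `min_load_gap`) is a hypothetical stand-in chosen for illustration. It only encodes the rules of thumb mentioned in this thread: ignore short-lived processes, never move shared-memory users, be wary of heavy I/O, and move only when the load difference is clear.

```python
from dataclasses import dataclass


@dataclass
class ProcessStats:
    cpu_seconds_so_far: float   # how long the process has already been computing
    io_calls_per_second: float  # observed rate of file/network operations
    uses_shared_memory: bool


def should_migrate(stats, local_load, remote_load, *,
                   min_cpu_seconds=1.0, max_io_rate=50.0, min_load_gap=1.0):
    """Return True only when every conservative check passes (illustrative thresholds)."""
    if stats.uses_shared_memory:
        return False  # shared-memory processes stay where they are
    if stats.cpu_seconds_so_far < min_cpu_seconds:
        return False  # too young: likely a short-lived job such as `ls`
    if stats.io_calls_per_second > max_io_rate:
        return False  # heavy I/O would have to travel back over the network
    return (local_load - remote_load) >= min_load_gap
```

With these made-up defaults, a process that has used 0.01 CPU seconds is never a candidate, while one that has computed for 30 seconds with little I/O moves only if the local node is clearly busier than the remote one.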

    • I work with and help maintain a small mosix (now openMosix) cluster at a university. While mosix only really shines when you are running many long, computation-intensive jobs which are not I/O bound, there is no reason you can't mix mosix with other clustering implementations.

      For example, we run a LAM [lam-mpi.org] MPI implementation on our cluster which allows us to carefully arrange which parts of each job go on each CPU and maximize efficiency. What's more, if we miscalculate, and one job on a heavily loaded machine tears off on a long cpu-burst, mosix will step in and migrate it over to a less loaded system for the duration of the burst.

      Nonetheless, it helps to inform your users not to start up 50 I/O bound jobs on one node and expect them to migrate. You end up having to give users access to multiple nodes to help balance load and this reduces security.

      All in all, I've found mosix is very useful if your users know how to code for it. Standard software will typically not benefit too much. That said, if you have a couple of cd-rom drives in your machine, grip performs quite nicely: the ripping takes place wherever the drives are, but the mp3 encoding tends to migrate across the cluster beautifully :).
  • Darn... I thought this said openMOSIS [mosis.com].

    I don't think anyone would mind a sourceforge for chip building (especially free nightly builds!)

    More on topic and to the point - it is good to see that MOSIX tech is now available opensource (stable anyway). Now we have yet another viable option for speeding up our Beowulfs (MOSIX is generally run with PVM/MPI - not as a replacement).

  • by JungleBoy ( 7578 ) on Monday April 15, 2002 @01:01PM (#3343963)
    I tried (vanilla)mosix a while back. It was cool, but had some real world drawbacks. If you start a process on a node and that process opens a socket, opens a file, or uses shared memory, then that process is stuck on that node. So if you start 10 dnet processes on one node, they won't migrate to idle nodes because they have open sockets (to the key server).

    I don't know if this is the case any longer. I heard a rumor that all these things were going to be implemented, so it'll be an interesting project to watch.

    Good Luck Open Mosix!

    -The JungleBoy
    • If you start a process on a node and that process opens a socket, opens a file, or uses shared memory, then that process is stuck on that node.

      From what I gathered from the OpenMosix Internals [sf.net] paper (which is very informative, BTW), when a process that has been migrated to another machine wants to perform network or file I/O, it communicates over the network to the UHN (Unique Home-Node), where the actual I/O operation will occur. The same goes for machine-specific system calls (gettimeofday() was used as an example).

      "One drawback of the deputy approach is the extra overhead in the execution of system calls. Additional overhead is incurred on file and network access operations. For example, all network links (sockets) are created in the UHN, thus imposing communication overhead if the processes migrate away from the UHN." There's probably a more specific quote in the paper.

      You seem to be right about processes using shared memory: "For applications using shared memory, such as Web servers or database servers, there will not be any benefit from OpenMosix because all processes accessing said shared memory must resided on the same node."
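The overhead the paper quote describes can be pictured with a small simulation. This is only a model of the idea as summarized in the comment above, not openMosix code: the class `MigratedProcess`, the set `SITE_DEPENDENT` and the operation names are assumptions made for the example. Operations tied to the home node cost one network round trip each when the process is running elsewhere; plain computation costs none.

```python
# Operations assumed (for this sketch) to be executed on the Unique Home-Node.
SITE_DEPENDENT = frozenset({"open", "read", "write", "socket", "gettimeofday"})


class MigratedProcess:
    """Counts how many operations must be forwarded to the home node."""

    def __init__(self, at_home):
        self.at_home = at_home
        self.round_trips = 0

    def perform(self, operation):
        if operation in SITE_DEPENDENT and not self.at_home:
            self.round_trips += 1  # request goes to the home node and back
            return "forwarded"
        return "local"


def count_round_trips(trace, at_home):
    """Replay a list of operation names and return the number of forwarded calls."""
    process = MigratedProcess(at_home)
    for operation in trace:
        process.perform(operation)
    return process.round_trips
```

Replaying the trace `["compute", "open", "read", "compute", "gettimeofday"]` costs nothing extra at home and three round trips after migration, which is why a process that does a lot of file or network I/O gains little from being moved.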

  • openMosix? (Score:3, Funny)

    by Viking Coder ( 102287 ) on Monday April 15, 2002 @01:06PM (#3343994)
    What openMosix was openMosix the openMosix name openMosix of openMosix that openMosix project openMosix again?
  • I have absolutely no use for clustered machines (my brother is the particle physicist), but I do happen to have a few Pentium-class machines lying around my apartment. I'm just starting to learn about practical uses of Linux in my home environment, and I'm wondering if I can use a Mosix cluster behind my switcher/firewall box on my DSL line.

    If I want to set up apache and FTP access to my network, could I use a Mosix cluster to help distribute the load? Maybe I could use a mosix cluster to speed the rendering of video that I edit on my workstation somehow?

    This stuff is just too cool to have absolutely no practical application in my life.
  • What kind of experiments have people done with kernel compiles on an openMOSIX cluster? If you have 10 computers and do a "make -j 10", will you see a benefit?
  • There's a Mosix package in Debian. What is the difference between this project and the one that produced the Debian package?
  • If I "openMosix" 4 computers and leave each one as a graphical login terminal to open a GNOME desktop, would it make for a "snappy" or fast desktop, or would the I/O slow it down? (Assume 100baseT Ethernet.)

    If anyone has tried using a cluster for 'end user desktop apps', how does it work out: faster, slower, or no difference?
    • I also wish someone would do some research on this. I have several multi-processor computers sitting around doing nothing (or very little), because I can't think of a good use for them. You can't play Quake 3 on a dual Pentium Pro, but it would make for an awesome node in a cluster.

      The problem is that I'm not sure I'd get any benefit from all that work (learning Mosix or Beowulf, implementing it, etc).

      Well, since nobody else seems to be doing it, I guess I'll have to break down and do it one of these days...

      Anyone want to donate a gigabit switch?
  • The article says Web servers and DB servers do not run faster. I would rather know which apps, by name, run faster. After all isn't that why you create a cluster?

    From reading posts, it seems graphic rendering is faster. Darn, I'm interested in Web servers and Database servers :(
  • When are we going to see Mosix being worked on for FreeBSD? This is something that the entire community could benefit from. Imagine processes migrating between Linux and BSD machines on the network. I know they have said they were working on it ... but I want to know when.
  • In the beginning, the Mosix project was for BSDs, but switched to Linux, as it was perceived to be the more widespread kernel...

    From the emails I swapped with the fellow in charge of the project (I'm going from memory, and this was towards the end of 1998), they really liked BSD, and all the code was written for the BSD kernel, but had, in the end, decided to rewrite for linux.

    I was crushed, as I had just set up a nice small network of BSD machines (for bandwidth and QoS testing), and really wanted to try it... but I got over it, and decided clustering wasn't going to address any of my issues anyway.
  • What I'd love to see with an application like this is clustered processing of video files, like DV or MPEG2 or MPEG4 files.

    Or, render farms needed for those thousands of figures in battles in Lord of The Rings
