First stop: VM’s SCM and related stuff

You want to compile your own VM, don’t you? Compiling the VM just for the sake of it, by blindly following some instructions, is not really helpful; otherwise, why not download the VM binary directly? My idea with this series of posts is that you understand and learn about the VM.

So… in order to compile the VM, you will have to deal with the VM’s Software Configuration Management. The first time I tried to compile the Pharo/Squeak VM was about 2 years ago. After that, I tried a few more times, and most of the time I ran into trouble. In addition, in the last months not only have there been a lot of changes related to code versioning and management, but the Cog VM has also come into play. So… a lot of people are confused about where each part of the VM is committed, or what is needed to compile each VM. I will try to clarify all that so that in the next post we can finally compile the VM by ourselves.

Since the Interpreter VM and the Cog VM differ a little in how their code is managed, I will cover them separately.

Interpreter VM

Downloading code

So, if you remember from the previous post, we have 2 parts: VMMaker, with the core of the VM, and the platform code. VMMaker is easy: it is the VMMaker package in squeaksource. The platform code lives in the official SVN. This sounds pretty straightforward, doesn’t it? But sometimes it is not. There are several problems (some probably due to my own ignorance) that I have found with this approach:

  1. The package VMMaker is not self-contained, i.e., it depends on other packages (some in the same repository and some in others). So… first problem: you need to know which other packages you need. For example, to build the VM you may also need the packages ‘FFI-Pools’, ‘SharedPool-Speech’, ‘MemoryAccess’, ‘SlangBrowser’, ‘Balloon3D-Plugins’, ‘Plugin-XXX’, etc.
  2. Closely related: it is not only a matter of knowing which packages are needed, but also which version of each. So… how do you know that for ‘VMMaker-dtl.221’ you need ‘FFI-Pools-eem.2’, ‘MemoryAccess-dtl.3’, ‘Balloon3D-Plugins-ar.6’, etc.? Just using the latest version of every package does not always work.
  3. Sync between VMMaker and the platform code. For a given VMMaker version, how do you know which SVN revision of the platform code you need? Or, vice versa, which VMMaker version do you need for a specific SVN revision? Once again, relying on the latest version of each is not a reliable solution.
  4. Similar to 3), there is yet another problem: the platform code, as you can imagine, is split into one folder per platform (see SVN): there is one for UNIX, one for Windows, one for Mac OS, and one for iOS (but forget this one for the moment). Each platform has a “leader” or “maintainer”, who is the person in charge of implementing/modifying its code. The problem arises when, for example, a change in VMMaker requires changes in the code of every platform, and not all of them get updated. So, for example, the changes are committed for UNIX, but not for Mac OS. So… the code of one platform is not always in sync with the rest. Note that I am not complaining: this is all open-source and we all do our best. I am just telling you the problems I have seen so far.
  5. The previous problem affects not only the commits in the repository, but also the VM releases. Most of the time, they are not in sync. One platform may release 5 times a year, while another releases once every year and a half 🙁
  6. The version numbers of the VMs are not in sync either. So for Mac, for example, you have Squeak 4.2.5beta1U, Squeak, Squeak 5.8b4, etc. For UNIX, Squeak-, Squeak-vm-3.7-7, Squeak, etc. On Windows, SqueakVM-Win32-4.1.1, SqueakVM-Win32-3.11.5, SqueakVM-Win32-3.10.9, etc. As you can see, the schemes are completely different, and for me this is complicated since you cannot refer to a single VM version across platforms.
  7. The SVN repository is restricted, so you cannot commit unless you have been granted access. This can be both a good and a bad thing.

I want to make it clear: I am not complaining about this, I am just describing the problems I have found, and how certain infrastructure built in the last months has helped with some of these issues.

So… you know that VMMaker is just another Monticello package, and you also know that you have to manage versions, dependencies, maybe even groups, etc… Does that ring a bell with anyone? YEEES! Of course, Metacello 🙂 So, one thing we did in Pharo (although I guess it is/was also used in Squeak) is to create a Metacello configuration for VMMaker: ConfigurationOfVMMaker. For those who don’t know what Metacello is, it is a package management system on top of Monticello, and ConfigurationOfVMMaker is a class where you can define versions, dependencies, etc., for your project. If you are a Smalltalker and you don’t know anything about Metacello, I recommend you take a look.

Anyhow, with ConfigurationOfVMMaker we solved the first two problems. With Metacello baselines we define all the structural information of the Interpreter VM: which packages are needed (the dependencies), possible groups (not everybody wants to load the same packages), repositories, etc. And with Metacello versions, we can define a whole set of working versions. For example, for ConfigurationOfVMMaker version 1.5 we load ‘VMMaker-dtl.221’, ‘MemoryAccess-dtl.3’, ‘FFI-Pools-eem.2’, etc. This is a set of frozen versions that we know work properly together. Notice that creating versions for ConfigurationOfVMMaker should be done by the “VM developers”. In fact, it was done by people like Torsten, Laurent and me. But the important thing is that the user doesn’t need to do that. The only thing the user needs to do in order to load VMMaker with all its dependencies, each at a working version, is to load the Metacello version. Do you want to try it yourself? Just take this Pharo image, and evaluate:

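Deprecation raiseWarning: false.
 Gofer new "fetch the configuration class first; assumes it lives in MetacelloRepository"
     squeaksource: 'MetacelloRepository';
     package: 'ConfigurationOfVMMaker';
     load.
 ((Smalltalk at: #ConfigurationOfVMMaker) project version: '1.5') load.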

Sorry for the ugly colors… wordpress.com doesn’t have Smalltalk 🙁

Why did I tell you to download that particular Pharo image? And why am I explicitly loading version 1.5 instead of the latest one? Because I want my posts to be reproducible. If you evaluate this instead:

 (Smalltalk at: #ConfigurationOfVMMaker) project lastVersion load.

I cannot guarantee that everything will work properly. The same goes for the Pharo image. If you take just any Pharo 1.0, 1.1 or 1.2 image, or Squeak 4.2, I am not sure that VMMaker will load correctly. The same happens if you load a version other than 1.5. So… in this case, I am sure (because I tested it before posting) that with that Pharo image and that version of ConfigurationOfVMMaker, VMMaker works properly.
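To make the split between baselines and versions more concrete, here is a minimal sketch of what two methods of a Metacello configuration class could look like. This is not the actual source of ConfigurationOfVMMaker: the selectors baseline10: and version15:, the baseline name, the repository URL, the dependencies between the packages and the group are all assumptions made for illustration. Only the package file names come from this post.

```smalltalk
baseline10: spec
	"Baseline: structure only (repository, packages, dependencies, groups)."
	<version: '1.0-baseline'>
	spec for: #common do: [
		spec blessing: #baseline.
		spec repository: 'http://www.squeaksource.com/VMMaker'. "assumed URL"
		spec
			package: 'FFI-Pools';
			package: 'MemoryAccess';
			package: 'VMMaker' with: [
				"assumed dependencies, just for the example"
				spec requires: #('FFI-Pools' 'MemoryAccess') ].
		spec group: 'default' with: #('VMMaker') ]

version15: spec
	"Version: pins every package of the baseline to one frozen file."
	<version: '1.5' imports: #('1.0-baseline')>
	spec for: #common do: [
		spec blessing: #release.
		spec
			package: 'FFI-Pools' with: 'FFI-Pools-eem.2';
			package: 'MemoryAccess' with: 'MemoryAccess-dtl.3';
			package: 'VMMaker' with: 'VMMaker-dtl.221' ]
```

With methods of this shape in the configuration class, asking the project for version '1.5' and sending load makes Metacello fetch exactly those files, in dependency order, instead of whatever happens to be the latest in the repository.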

Coming back… problem 3) is not solved yet, since you still cannot know which version of ConfigurationOfVMMaker you need for a certain SVN revision, or vice versa. But we will come to this later on… The rest of the problems are not solved either.

Generating the VM

You need two things to compile the VM: the C platform code that is committed directly in SVN, and the C code generated from VMMaker. Do you always need to translate VMMaker to C? Not necessarily, because the generated code is also committed in SVN, usually under the “/src” folder, for example here. It is there so that anyone who wants to compile can just download both parts and build the VM with GCC. No need to take a Smalltalk image, load VMMaker, and generate sources. So… when do you really need to generate sources from VMMaker?

  1. When the /src in the SVN is outdated in relation to the platform code.
  2. When you have made changes in VMMaker. You can change VMMaker just for fun, for your own project, for testing, etc.
  3. For learning purposes 🙂

So… how do you compile the VM? Yes, of course, with a C compiler… but that’s not enough information! For example, you usually need to place the /src folder (where the sources generated by VMMaker go) in a certain location so that the makefiles can find it. Even worse, each platform has its own build instructions. You can find the instructions for UNIX here, for Windows here, and for Mac OS (after searching for this info for a long time) they seem to be here (if not, please let me know).

Not only does each platform have its own instructions to build the VM, but some also lack IDE support. For example, it is not easy to compile the VM out of the box with Microsoft Visual Studio or with Apple’s XCode. For XCode, for instance, you need a .xcodeproj file for every project. The problem was that most of the time (at least when I tried) this file contained file locations from the committer’s machine (which of course are different from mine). So, in the end, I usually needed to modify the project in order to be able to compile and run the VM from XCode. I am telling you all this so that you can understand the progress we (the community) have made in the last months…

Internal and external plugins

Before going further, a little remark: do you remember I told you about the VM plugins? Like FilePlugin, SocketPlugin, etc. Well, plugins can be compiled in two ways: internal or external. Internal plugins are linked together with the core VM, i.e., they are inside the VM binary. External plugins are distributed as separate shared libraries, and the cool thing is that you don’t need to do anything at all to the VM: at runtime the standard VM can just load an external plugin and use it. Whether you should compile a plugin as internal or external is out of the scope of this post. What is important here is that:

  • the normal guy who just wants to compile the VM shouldn’t need to know how each plugin must be compiled.
  • there are some plugins that only work when they are compiled in one of the two ways.

Generating the VMMaker sources

Imagine that for any reason (maybe one of those mentioned above) you need/want to translate the VMMaker package to C. How do you do that? The default approach with the Interpreter VM is to use a tool called VMMakerTool, which is also the name of its class 😉 So… VMMakerTool is a class in the VMMaker repository that provides a UI to generate the sources. Here you can see a screenshot:

To reproduce the screenshot, just evaluate:

VMMakerTool openInWorld

The tool is pretty cool since it lets you do a lot of things: choose which plugins to include and whether you want them internal or external, set the source output directory, the platform code directory, the CPU architecture (32 or 64 bits), etc. This tool is awesome, but from my point of view, it is too much for a non-VM-hacker. Why? Because of what I have already told you: the normal user shouldn’t need to know which plugins to include, nor whether they should be internal or external. At the same time, by following some conventions, the directories for platform code and sources could be set automatically.

Fortunately, VMMakerTool is just the UI and it relies on the “model”, which is the VMMaker class (yes, VMMaker is the name of the squeaksource repository, the name of one of the packages and also the name of one of the classes, heheheh). With the VMMaker class we can do the same as with VMMakerTool, but from code. Example:

| sourcePath |

"The path where I checked out the SVN tree"
sourcePath := '/Users/mariano/Pharo/VM/svnSqueakTree/trunk'.

"Generate new sources"
VMMaker default
 platformRootDirectoryName: sourcePath, '/platforms';
 sourceDirectoryName: sourcePath, '/platforms/iOS/vm/src';
 internal: #(
 "lots of plugins here.....omitted in this example"
 )
 external: #();
 generateEntire. "assumed final message, the one that triggers the generation"

So… suppose that someone provided you with the list of plugins for every platform, already knowing which of them should be internal and which external. By following some conventions, couldn’t everything be automatic? Ok… we are getting there, don’t worry 😉

Cog VM

The infrastructure for the Cog VM is a little messy for me, so I will do my best to explain it. Cog started as a fork of the Interpreter VM. So… imagine that you want to create a fork of VMMaker (in squeaksource) and another fork in SVN for the platform code. Monticello doesn’t provide real and easy branch support, so Cog needed to do something weird (at least for me). Suppose that a regular version of the VMMaker package is ‘VMMaker-dtl.161’. In this case ‘dtl’ is the initials of the committer, Dave Lewis. So… what does the Cog branch of VMMaker look like? Its versions are just normal versions, but with ‘oscog’ as the committer (I guess it stands for Open-Source Cog). Example: ‘VMMaker-oscog.54’. That means that in order to load Cog, you need to open the VMMaker repository and look for a version whose name starts with ‘VMMaker-oscog’. That is where Eliot commits the official Cog versions.

Exercise: Take a Monticello Browser, add the VMMaker repository and browse the version ‘VMMaker-dtl.223’. Then, browse ‘VMMaker-oscog.54’ and notice the difference between them. For example, in ‘VMMaker-oscog.54’ there are several categories that are not even present in ‘VMMaker-dtl.223’, like ‘VMMaker-JIT’, ‘VMMaker-Multithreading’, etc. Even more, the same categories contain different classes.
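Since the Cog “branch” is encoded only in the version name, it can help to see how such a name splits into package, committer and number. The following workspace snippet is a small sketch of that naming convention using plain string operations. It is not Monticello’s own code, and the list of version names is just a sample taken from this post.

```smalltalk
| versionNames parse cogVersions |
versionNames := #('VMMaker-dtl.221' 'VMMaker-oscog.54' 'VMMaker-dtl.223').

"Split 'Package-author.number' into its three parts. The LAST dash is used
 because package names such as 'FFI-Pools' contain dashes themselves."
parse := [:name | | packageAndAuthor |
	packageAndAuthor := name copyUpToLast: $..
	Dictionary new
		at: #package put: (packageAndAuthor copyUpToLast: $-);
		at: #author put: (packageAndAuthor copyAfterLast: $-);
		at: #number put: (name copyAfterLast: $.) asInteger;
		yourself].

"Keep only the versions committed as 'oscog', i.e. the Cog branch."
cogVersions := versionNames select: [:name |
	((parse value: name) at: #author) = 'oscog'].
```

After evaluating it, cogVersions holds only ‘VMMaker-oscog.54’, which is the same filtering you do by eye when browsing the repository in the Monticello Browser.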

Now, regarding the branch in the platform code, this is much easier since it is a regular SVN branch which can be found here.

Fortunately, people have also developed a ConfigurationOfCog, which follows the same idea as ConfigurationOfVMMaker.

One difference I found with the regular VM is that Cog is supposed to be translated to C using the VMMaker class directly (not VMMakerTool). You can see how to do it in this workspace.

So, in summary, the way to compile the Cog VM is more or less the same as for the Interpreter VM: you take a Smalltalk image, you load Cog (you can use ConfigurationOfCog), you generate the sources, you check out the SVN branch, and finally you compile (the build instructions for each VM are in the same SVN). Generating the sources may not be necessary if /src is in sync with /platforms.

Finally, notice also that Eliot usually uploads regular VM builds (Cog VM binaries for all OS) to this url.

New infrastructure

The same way we should thank Teleplace for Eliot Miranda’s work, we should also thank INRIA for paying a Pharo engineer: Igor Stasenko. The good news is that, since he started a couple of months ago, he has not been working on Pharo itself but on a new VM infrastructure. What is all this about? I’ll give you only a quick introduction because in the next post we are going to compile the VM using this infrastructure. Disclaimer: this new infrastructure is only for the Cog VM and all its variants, not for the Interpreter VM. I guess that’s because of the resources/time available.

So…in a nutshell, there are 3 big changes:

  1. Use GIT instead of SVN. There is a new repository for the platform code which is versioned by GIT instead of SVN. There is a new account for CogVM in gitorious. It seems that nowadays if you are not in Git you are not cool, you do not exist. Ok, we are cool now 🙂  No one needs to ask for a blessing, everybody can clone, hack and push/share changes. People can pick the changes without having to have the permissions to publish.
  2. Use CMakeVMMaker instead of VMMakerTool. CMakeVMMaker is a little tool that automates the build. It does two important things: 1) it translates VMMaker to C, using the VMMaker class, and 2) it generates CMake files to ease the build. To do this, it automatically assumes (although this can be customized) which plugins are needed and how they are compiled (internal or external), the needed compiler flags, the needed directories, etc. CMake is an excellent tool for cross-platform compiling and for automation… and CMakeVMMaker generates all the files CMake needs. For those who don’t know what CMake is, imagine one abstract step before makefiles. CMake is a cross-platform, open-source build system where you define everything that is needed, like directories, compiler flags, etc., in text files. Once you have that, CMake can generate different outputs: normal makefiles that you build with the command “make”, Apple’s XCode projects or even Microsoft Visual Studio projects 🙂
  3. Continuous Integration for VMs!!  Can you imagine that for every GIT commit, Mr. Hudson takes the latest PharoCore image, loads the Cog VM, generates sources, and compiles the VM for Windows, Linux and Mac OS ? Come on! isn’t this really cool?  Ok, you don’t believe me? Go to the Pharo CI for CogVM.
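To give a feeling for what “generating CMake files from Smalltalk” means in item 2, here is a toy workspace sketch. It is not CMakeVMMaker’s code and does not use its API: the plugin lists, the project name ToyVM, the target name vm, the plugin MyPlugin and all file paths are made up for illustration. It only shows the idea that, given the lists of internal and external plugins, the build description can be written out as plain text.

```smalltalk
| internalPlugins externalPlugins cmake |
"Hypothetical plugin lists; a real configuration has many more."
internalPlugins := #('FilePlugin' 'SocketPlugin').
externalPlugins := #('MyPlugin').

cmake := String streamContents: [:out |
	out nextPutAll: 'cmake_minimum_required(VERSION 2.8)'; cr.
	out nextPutAll: 'project(ToyVM C)'; cr.
	"Internal plugins are compiled into the VM executable itself."
	out nextPutAll: 'add_executable(vm vm/interp.c'.
	internalPlugins do: [:each |
		out nextPutAll: ' plugins/'; nextPutAll: each; nextPutAll: '.c'].
	out nextPut: $); cr.
	"Each external plugin becomes its own shared library."
	externalPlugins do: [:each |
		out
			nextPutAll: 'add_library('; nextPutAll: each;
			nextPutAll: ' SHARED plugins/'; nextPutAll: each;
			nextPutAll: '.c)'; cr]].
```

The resulting string contains four CMake commands: the two header lines, one add_executable that lists the internal plugins next to the interpreter, and one add_library for the external plugin. Written to a CMakeLists.txt file, this kind of text is what CMake turns into makefiles or IDE projects.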

In the next post we will see how to use this new infrastructure and how it solves some of the problems mentioned in this post. I want to thank Esteban, Igor, Dave and everyone who answered my questions on the mailing lists 🙂


18 thoughts on “First stop: VM’s SCM and related stuff”

  1. Thanks Mariano, this is getting very interesting as each post is presented. I just had a little idea of the mess that VM compiling world is, but I see that it really is a mess. Glad that you are documenting that info in one place and glad that are more modern tools to automate and build a VM from zero.

    I’m very surprised that this hasn’t happened before. So many years of building the VM with recipes, and incantations that only a few knew.

    Thanks to you for documenting it and thanks to the people building the VM and tools related.

    1. Yes, you will see that with GIT + CMakeVMMaker + Hudson the procedure is much better. There are still a lot of things to be improved, but at least, from my point of view it was a big progress.

    1. Hi. Yes, I saw it. I am a newbie with this blog stuff, but if I understood it correctly, since I am hosting my blog on wordpress.COM I can neither install the plugins I want nor use javascript libraries 🙁
      To do that I would need to deploy my blog on some other wordpress.ORG host that lets me install whatever I want… but I don’t know of any free one…


      1. Thanks Daan! For the moment I will continue using Ruby as the syntax hehehe. If you make it work, and WordPress picks up that version, please let me know since I will really appreciate it.

  2. You do really good posts. Every now and then I tried to dive into VM generating. And everytime I was missing the kind of information you provide. As a side effect you’re also making it much more transparent what people like Igor do/achieve. That is important, too.
    Can’t wait for the next post to appear!

    1. Thanks Norbert. The next post is in the oven. But it takes time because I am also building the VM step by step on Linux and Windows, finding “problems” along the way, and trying to fix them hehehe. But soon… (during the weekend, probably)

  3. “Sorry for the ugly colors… doesn’t have Smalltalk”

    Mariano, check out the WP-Syntax plugin (e.g. <pre lang="smalltalk"…)

    1. Hi Sean. Thanks for the idea. The problem is that this blog is hosted in wordpress.COM not a .org somewhere. Hence, the integrated version of the plugin does not include Smalltalk as the language. Someone did it, but it is a “fork” of the official plugin and not included in wordpress.COM 🙁

  4. “Mr. Hudson takes the latest PharoCore image, loads the Cog VM, generates sources, and compiles the VM for Windows, Linux and Mac OS”

    This is so cool and needed! Now anyone can get the latest official, compatible source code, VMMaker, Pharo image, and build scripts; and start playing!

    1. Exactly! In fact, exactly as you did this week 🙂 All of the efforts are just for that: make VM hacking easier and get more people involved. It seems now vmHackers := vmHackers + 1 hahah
