Introducing the AdaCore Blog

We’re pleased to announce the launch of the AdaCore Blog providing an insight into the AdaCore Ecosystem.

http://blog.adacore.com

Jamie Ayre
AdaCore

Gem #161 : So long and thanks for all the memories!

After seven years and 160 iterations, the Ada Gem of the Week series is coming to an end, at least in its current format: it will be replaced by a blog that will still address technical subjects. But more about that later. For this last Gem we want to reflect on what has been a very successful and widely read series.

The idea behind the Gems series was to provide information about areas of our technology and the Ada programming language. It was especially successful in bringing lesser-known tools to the forefront and in providing help and hints for optimizing your use of GNAT and Ada. Products covered included GNAT Pro, SPARK Pro, GNATstack, GNAT Component Collection, PolyORB, AWS, and GtkAda. Many subjects were discussed: Ada 2005, Ada 2012, certification, distributed systems, embedded development, safe and secure programming, formal methods and verification, IDEs, libraries and bindings, use of mixed languages, modeling, multicore programming, static analysis, and testing.

The Gems archive remains available online.

The following Gems were the most viewed:

Gem #119: GDB Scripting - Part 1

Gem #128: Iterators in Ada 2012 - Part 2

Gem #140: Bridging the Endianness Gap

Gem #123: Implicit Dereferencing in Ada 2012

Gem #84: The Distributed Systems Annex 1 - Simple client/server

Gem #142: Exception-ally

Gem #138: Master the Command Line - Part 1

Gem #59: Generating Ada bindings for C headers

Gem #127: Iterators in Ada 2012 - Part 1

Gem #117: Design Pattern: Overridable Class Attributes in Ada 2012

Of the 160 published gems, 22 were supplied by outside authors, and we would like to thank them for their kind contributions.

The Gems archive will be kept in its current location, so you can continue to use them and we hope you do so. Although the Gems series in its current format may be ending, we will continue to publish similar content on the upcoming AdaCore blog. The blog will touch on many subjects, including information on new technologies, upcoming product releases, as well as the handy hints and tips many of you found so useful. So it's so long from the whole Gems team, and we look forward to seeing you soon on the AdaCore blog!

Jamie Ayre
AdaCore

Gem #160 : Developing unit tests with GNATtest

Let's get started...

When you invoke GNATtest, it analyzes your program and identifies all library-level subprograms (that is, subprograms declared inside package specs); for each such subprogram, GNATtest creates a unit test skeleton. It also takes care of creating a fully compilable test harness that will encapsulate these unit test skeletons, as well as your own code. When you compile and run the harness, it calls each of the unit tests one by one, then analyzes the results and reports them. All you'll have to do yourself is replace the unit test skeletons with the actual unit test code.

The GNAT distribution provides several examples of GNATtest use; these can be found at [install root]/share/examples/gnattest. Let's use the example "simple" to see what the typical GNATtest use might be.

If you open the Simple project, you will see that it contains a single package that currently declares a single subprogram Inc. Let's develop a test suite for it.

One nice thing about GNATtest is that it's integrated with GPS. So although you can call the tool from the command line (details, as usual, can be found in the GNAT User's Guide), you don't have to. Instead, you can get started with your test suite literally with one click.

To do that, simply select the GPS command "Tools -> GNATtest -> Generate unit test setup". In the project explorer you'll see your project replaced by the automatically generated project hierarchy, in which your own project has become just one of the dependencies.

Now you can quickly jump to the test-case code by right-clicking on the Inc routine and selecting "GNATtest -> Go to test case". You'll see that currently the test merely contains a stub:

AUnit.Assertions.Assert
  (Gnattest_Generated.Default_Assert_Value, "Test not implemented.");

You can already build and run your harness, but all that the above code will do is cause the test to fail:

$ test_runner
simple.ads:7:4: corresponding test FAILED: Test not implemented. (simple-test_data-tests.adb:25)
1 tests run: 0 passed; 1 failed; 0 crashed.

Let's implement the test, by replacing Gnattest_Generated.Default_Assert_Value with a Boolean expression that actually calls our subprogram and verifies the result, and provides an informative message:

AUnit.Assertions.Assert (Inc (1) = 2, "Incrementation failed.");

After recompiling the test driver you can see that the test now passes.
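
For readers who want to try the assertion outside the generated harness, here is a minimal single-file sketch. It assumes that Inc simply returns its argument plus one (the body is not shown in this Gem), nests a stand-in package Simple inside a procedure purely to keep the example in one file, and uses the standard Ada.Assertions.Assert in place of AUnit. The name Inc_Check is made up for this illustration. In the real harness the check lives in the generated test routine and Inc stays in its library-level package, since GNATtest only considers library-level subprograms.

```ada
with Ada.Assertions;
with Ada.Text_IO;

procedure Inc_Check is

   --  Stand-in for the example's package (assumption: the real spec and
   --  body ship with the GNAT examples and may differ in detail).
   package Simple is
      function Inc (X : Integer) return Integer;
   end Simple;

   package body Simple is
      function Inc (X : Integer) return Integer is
      begin
         return X + 1;  --  assumed behavior of Inc
      end Inc;
   end Simple;

begin
   --  Same Boolean expression and message as in the test case above.
   Ada.Assertions.Assert (Simple.Inc (1) = 2, "Incrementation failed.");

   --  One extra check around zero, added for illustration only.
   Ada.Assertions.Assert (Simple.Inc (-1) = 0, "Incrementation failed.");

   Ada.Text_IO.Put_Line ("Inc checks passed");
end Inc_Check;
```

If Inc were broken, Ada.Assertions.Assert would raise Assertion_Error carrying the message, which roughly plays the role of the FAILED line reported by test_runner.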

Keeping the test suite up to date

You can call GNATtest any number of times on an already generated harness -- it will never overwrite test routine bodies, so you won't lose your work. At the same time, it will add unit test skeletons for any subprograms you have since added to your code.

Let's see how this works. Uncomment the second subprogram and its body in the source files for the project Simple, then run GNATtest again, and compile and run the test driver.

You will see that the old test still passes but there is now a new unimplemented test:

$ test_runner
simple.ads:7:4: corresponding test PASSED
simple.ads:9:4: corresponding test FAILED: Test not implemented. (simple-test_data-tests.adb:46)
2 tests run: 1 passed; 1 failed; 0 crashed.

GNATtest will also warn you that some tests have become dangling if you change the parameter profile or delete subprograms for which unit tests have already been written.

Conclusion

As you can see, GNATtest provides a lightweight and readily available solution to get started with developing unit tests. Unless you already have a unit-testing infrastructure that works for you (oh, and by the way, if you do, did you know that you can also merge existing unit tests into the GNATtest-generated harness?) -- we encourage you to try out GNATtest!

Vasiliy Fofanov
AdaCore

Gem #159 : GPRinstall - Part 2

Let's get started...

In the first installment we described how to install a library project. It's also possible to install a standard project. In that case the spec files and corresponding objects are installed. This is not very different from the library case described in the first part.

But gprinstall can also handle application projects, that is, projects with one or more mains specified. In such cases we do not want to install sources, just the executables built as part of the project. Consider this project:

project Prj is

   for Main use ("mytool.adb");

   ...

end Prj;

To install only mytool (or mytool.exe on Windows platforms), gprinstall's "usage" mode can be set:

$ gprinstall -p --mode=usage prj.gpr

The default mode is "dev" (for development), which is the most common mode to use when installing a library. In development mode we need the specs forming the API of the library. In usage mode, gprinstall will copy the executable mytool to the "bin" directory under the install prefix.

The default prefix is where the compiler is installed. It's possible to change this with the --prefix option:

$ gprinstall -p --prefix=/opt/myapps --mode=usage prj.gpr

All default values can be changed. For example, the default executable directory is "bin" which can be changed with the --exec-subdir option:

$ gprinstall -p --prefix=/opt/myapps --exec-subdir=tools --mode=usage prj.gpr

With this last command, the mytool executable will be installed in the /opt/myapps/tools directory.

Up to now we have only seen ways to install the build artifacts of the project. What if documentation also needs to be installed?

Again gprinstall can take care of that: the Artifacts project attribute can be used to describe items to be installed that are not part of the project build. This attribute must be placed in the Install package of the project:

project Prj is

   for Main use ("mytool.adb");

   package Install is
      for Artifacts ("doc") use ("doc/README", "doc/VERSION");
      for Artifacts ("share/examples") use ("examples/*");
   end Install;

   ...

end Prj;

So this command:

$ gprinstall -p --prefix=/opt/myapps --mode=usage prj.gpr

will install the executable mytool as above, but it will also copy the files doc/README and doc/VERSION to the /opt/myapps/doc directory, and everything under the examples directory will be copied into /opt/myapps/share/examples.

That's all there is to it! Wait, no, in fact... we can now avoid any further messing around with makefiles!

Pascal Obry
EDF R&D

Gem #158: GPRinstall - Part 1

Let's get started...

It's quite easy to build a library using GPRbuild. GPRbuild handles multiple languages and multiple tool chains. Basically, most projects can be built using a simple command like:

$ gprbuild -p libprj.gpr

Note that -p tells gprbuild to create any missing obj, lib, or exec directories. The same option applies to gprinstall. After some minutes or hours, depending on the size of the code base, we end up with a set of object code and/or libraries. So far so good!

Our next step is to install all of this for use by another project. That is, we have a library that we want to make available to another project. The other project needs the Ada spec files that will be "withed" and the library file (static or dynamic) to link against. We then begin writing a set of makefile rules:

install:
        mkdir /some/prefix/include/prj
        mkdir /some/prefix/lib/prj
        cp lib/*.ali /some/prefix/lib/prj

Wait... Do we have cp on Windows? Do we have to set the .ali file read-only? Do we have to copy the bodies? Oh, and of course we need to provide a project that will be installed at the right place to be able to "with" it from another project.

It's starting to look like there's a lot of work ahead...

Or not. In fact, GPRinstall will take care of everything. Yes, it really will take care of everything.

Specifically, GPRinstall will:

  • copy the right sources, given the current naming conventions and exceptions
  • copy the spec to the right place
  • copy the bodies to the right place, only if needed
  • copy the object code or the library to the right place
  • generate a project file automatically and put it in the right place (where gprbuild will look for it)
  • record all installed files and provide an easy way to uninstall exactly what has been installed, no more and no less
  • provide a way to install multiple builds (debug/production, static/shared libraries are the most common), selectable with a project variable

The "right place" above is, by default, the location where the compiler is installed. Of course, GPRinstall has many switches to indicate where compilation artifacts are to be installed. So, to install the library project above we just have to run:

$ gprinstall -p libprj.gpr

If we have a second build for the shared library:

$ gprbuild -p -XLIBRARY_TYPE=relocatable libprj.gpr

then we can install it with:

$ gprinstall -p -XLIBRARY_TYPE=relocatable --build-name=shared libprj.gpr

Note that we have specified a build name here, which is just a string used to identify a specific installation. This string is added as a possible value in the BUILD variable of the generated project. On the first gprinstall invocation, we have used the default build name string which is "default". So we now have something like this in the generated project:

library project Libprj is
   type BUILD_KIND is ("default", "shared");
   BUILD : BUILD_KIND := external("LIBPRJ_BUILD", "default");

   for Languages use ("Ada");

   case BUILD is
      when "shared" =>
         for Source_Dirs use ("../../include/libprj.shared");
         for Library_Dir use "../../lib/libprj.shared";
         for Library_Kind use "relocatable";
      when "default" =>
         for Source_Dirs use ("../../include/libprj");
         for Library_Dir use "../../lib/libprj";
         for Library_Kind use "static";
   end case;

   ...

And if we decide to actually remove all installations of this library, we just have to run:

$ gprinstall --uninstall libprj

That's about it. Wait, no, in fact there's one more thing... we can now remove all the makefile mess!

Pascal Obry
EDF R&D

Gem #157: Gprbuild and Code Generation

This Gem follows on from Gems 152 and 155, which we recommend reading first as an introduction.

Let's get started...

Gprbuild was introduced by AdaCore as a replacement for gnatmake. The goal is still to make it easy to build a whole application. However, in contrast to gnatmake, which is limited to Ada, gprbuild is a multilanguage tool, and can happily launch compilers for the various languages you use in your application, and then link all resulting object files together to create a final executable.

Gprbuild's work is described via a project file (which uses the .gpr extension). In contrast to Unix 'make' (a traditional tool used to drive the program build process), a project file gives a static description of a project. You do not have to describe the exact commands to spawn to recompile the files, nor do you need to describe the dependencies of your sources, nor specify when an object file has to be recreated because some of the sources it depends on have changed.

Instead, gprbuild itself has knowledge of various compilable languages. For instance, it knows that for Ada an object file (extension .o) is generated by GNAT from a set of source files having the same base name but a different extension (usually .ads or .adb). It also knows that a source file might depend on other source files, and when any of those change, the object file also needs to be regenerated.

Thanks to this built-in knowledge, the project file for a typical pure Ada application is much simpler than the equivalent Makefile would be (unless of course you are calling gnatmake or gprbuild from that Makefile). You only need to point it to the source files, and the rest is automatic. Similar support is available for C and Fortran.

However, nowadays many applications need to first generate part of their sources from higher-level languages, such as UML or Simulink. Fortunately, gprbuild is relatively easy to extend to other languages, and this Gem describes the various steps required to accomplish that.

A custom code generator

Let's first start with a description of the problem.

Let's assume we have an in-house code generator that reads information from one or more XML files and then generates one or more Ada files from those. These generated Ada files should then be compiled together with hand-coded Ada files before the final executable can be linked.

Of course, for optimization purposes, we would like to do the minimal amount of recompilation when no XML file has been modified (that is, not regenerate the Ada files), or when no Ada file has been modified (although that part is already automatically handled by gprbuild).

This example would apply similarly when using code generators such as Lex or Yacc, for instance.

Describing the code generator to gprbuild

Of course, since this is a custom code generator, gprbuild knows nothing about it by default, and we need to set things up. This can be done directly in a project file, but gprbuild provides flexibility beyond that.

Gprbuild itself has no hard-coded knowledge about compilation languages. Instead, it reads all the information it needs from a configuration file (usually with the extension .cgpr). The configuration file is not written by users, but is generated from a set of XML files (the "knowledge base") via a second tool called gprconfig. The high-level behavior is as follows:

     XML files (knowledge base)
                |
            gprconfig
                |
                V
            auto.cgpr       user project (default.gpr)
                |                       |
                \_______________________/
                            |
                        gprbuild
                            |      
                            V
            commands to execute for the build

What we need to do is create a new XML file for the knowledge base.

Let's keep the XML language for pure XML files, in case the application contains some that are unrelated to code generation. Instead, we will "invent" a new language, tentatively named "xml_for_ada". Gprbuild needs to find the sources for this language automatically (they have a standard .xml extension).

The XML file would look like the following:

<?xml version="1.0" ?>
<gprconfig>
   <compiler_description>
      <name>XML_For_Ada</name>
      <executable>codegen</executable>
      <languages>xml_for_ada</languages>
      <version>1.0</version>
   </compiler_description>

   <configuration>
      <compilers>
         <compiler language="xml_for_ada" />
      </compilers>
      <config>
    package Naming is
       -- How to recognize XML files
       for Body_Suffix ("xml_for_ada") use ".xml";
    end Naming;

    package Compiler is
       -- describes our code generation, from XML to Ada
       for Driver ("xml_for_ada") use "codegen";
       for Object_File_Suffix ("xml_for_ada") use ".ads";
       for Object_File_Switches ("xml_for_ada") use ("-o", "");
       for Required_Switches ("xml_for_ada") use ("-g");
         -- always use this switch
       for Dependency_Switches ("xml_for_ada") use ("-M");
         -- -M file.d (indicates the dependency file)
    end Compiler;
      </config>
   </configuration>
</gprconfig>

The first part ("compiler_description") describes how to find the executable for the code generator. Its name is "codegen", and it only needs to be located if the user's project indicates that it uses the "xml_for_ada" language. The version number is hard-coded for now, but it would be possible to ask codegen itself for its current version number, which could be used later to change the gprbuild support for it depending on the version.

Assuming a code generator is found, the second part of the XML file ("configuration") indicates what code needs to be added to the gprbuild configuration file. This is where the magic of gprbuild happens.

First, we're letting gprbuild know how to recognize our XML files (".xml" extension).

Second, we're telling it that to process those files, it needs to call an executable named "codegen" that will generate Ada files with the extension ".ads".

For proper handling of dependencies (which will minimize recompilation), gprbuild needs to know the name of the generated file. For us this is a file with a ".ads" extension (although we could of course generate additional files). When multiple XML files are needed for a single Ada file, the code generator should generate a dependency file (basically similar to a Makefile extract) that lists those files. In our case, we decided that this dependency file name will be passed to the code generator via the -M switch.

Finally, we can decide that some switches are mandatory for the code generator, and we show an example with the -g switch.

Sample code for a very simple code generator is shown below:

with Ada.Text_IO; use Ada.Text_IO;
with GNAT.Command_Line; use GNAT.Command_Line;

procedure Codegen is
   F : File_Type;
begin
   loop
      case Getopt ("g M: o:") is
         when 'g' => -- some random switch passed to the compiler
            null;
         when 'M' => -- We need to generate the dependency file.
            Create (F, Out_File, Parameter);
            Put_Line (F, "b.ads: ../src/b.xml");
            Close (F);
         when 'o' => -- Name of the object file
            -- We need to parse the XML file and generate code.
            -- Let's simulate it.
            Create (F, Out_File, Parameter);
            Put_Line (F, "package B is");
            Put_Line (F, "   procedure Foo is null;");
            Put_Line (F, "end B;");
            Close (F);
         when others =>
            exit;
      end case;
   end loop;
end Codegen;
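
The sample above hard-codes a single prerequisite in the dependency file. When several XML files feed one Ada file, the same Makefile-style line could list all of them. The following sketch only builds and prints such a line. Dep_Line, Name_List, Dep_Line_Demo and common.xml are hypothetical names introduced for illustration, and the one-line "target: sources" layout is assumed from the sample rather than taken from the gprbuild documentation.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Dep_Line_Demo is

   type Name_List is array (Positive range <>) of Unbounded_String;

   --  Build "target: source1 source2 ..." in the style of a Makefile rule.
   function Dep_Line (Target : String; Sources : Name_List) return String is
      Result : Unbounded_String := To_Unbounded_String (Target & ":");
   begin
      for I in Sources'Range loop
         Append (Result, " ");
         Append (Result, Sources (I));
      end loop;
      return To_String (Result);
   end Dep_Line;

   --  Hypothetical inputs: b.xml is from the Gem, common.xml is made up.
   Sources : constant Name_List :=
     (To_Unbounded_String ("../src/b.xml"),
      To_Unbounded_String ("../src/common.xml"));

begin
   --  Prints: b.ads: ../src/b.xml ../src/common.xml
   Ada.Text_IO.Put_Line (Dep_Line ("b.ads", Sources));
end Dep_Line_Demo;
```

In a real generator the resulting string would be written to the file named by the -M parameter, in place of the fixed Put_Line in the 'M' branch.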

The user project

The hard part is now done, and we can move on to writing a project that uses this code generator. Since we provided that information in a general manner to gprbuild, we can have multiple such projects without duplicating the work above.

The setup is the following:

      gprconfig_db/

          Contains the XML file we created above for gprbuild.

      src/

          This directory should contain the .xml files used for code generation, as well as hand-coded Ada files. Let's assume it contains b.xml and a.adb.

      generated/

          This directory will contain the Ada files generated from the XML files.

      obj/

          This directory will contain the object files resulting from the compilation of the Ada files.

Gprbuild needs to know all of its sources when it starts, so in practice we will need to make two runs of gprbuild: one to generate the Ada from XML and the second to compile all the Ada files and link the executable. That means that the set of source files is different in the two steps: in the first step, the sources are XML files, and the "object files" are the resulting Ada files; in the second step, the sources are the Ada files, and the object files are the usual .o files.

We could implement this setup with two different project files, one for each step. However, my own preferred approach is to use a single project file with a scenario variable that indicates the current step.

Here is the project file:

project Default is

   type Compilation_Step is ("Step_1", "Step_2");

   Step : Compilation_Step := External ("STEP", "Step_1");

   case Step is
      when "Step_1" =>
         for Languages use ("xml_for_ada");
         for Source_Dirs use ("src");
         for Object_Dir use "generated";
      when "Step_2" =>
         for Languages use ("Ada");
         for Main use ("a.adb");
         for Source_Dirs use ("src", "generated");
         for Object_Dir use "obj";
   end case;

end Default;

After all this setup, the compilation itself is done with these two simple commands:

   gprbuild --db gprconfig_db -Pdefault -XSTEP=Step_1
   gprbuild -Pdefault -XSTEP=Step_2

The first time they are run, the output is:

   > gprbuild --db gprconfig_db -Pdefault -XSTEP=Step_1
   codegen -g -Mb.d b.xml -o b.ads

   > gprbuild -Pdefault -XSTEP=Step_2
   gcc -c a.adb
   gcc -c b.ads
   gprbind a.bexch
   gnatbind a.ali
   gcc -c b__a.adb
   gcc a.o -o a

The second time (if no file was modified), we get the expected:

   > gprbuild -Pdefault -XSTEP=Step_2
   gprbuild: "a" up to date

If we now modify b.xml and rerun both commands, b.ads is regenerated and then recompiled, as in the first run.

One more trick: we could in fact describe a third step (Step_3), that would be the default value for the external variable STEP. In that third step, the project would contain the languages for both xml_for_ada and Ada. The use of that artificial third step would be to load that project directly in GPS, conveniently allowing editing of both XML and Ada files.

Emmanuel Briot
AdaCore

Emmanuel Briot has been with AdaCore since 1998. He has been involved in a variety of projects, in particular oriented towards graphical user interfaces, including GtkAda, GPS, XML/Ada, GnatTracker and our internal CRM. He holds an engineering degree from the Ecole Nationale des Telecommunications (Brest, France).

Gem #156: Listing Control in GNAT

Let's get started...

The default output from the compiler just includes the error messages, along with any warnings that are enabled by default.

For example:

        f.adb:3:04: warning: "return" statement missing following this statement
        f.adb:3:04: warning: Program_Error may be raised at run time
        f.adb:4:14: warning: value not in range of type "Standard.Natural"
        f.adb:4:14: warning: "Constraint_Error" will be raised at run time
        f.adb:6:16: division by zero
        f.adb:6:16: static expression fails Constraint_Check

These messages give the exact location (file, line, and column) of each problem, so by opening the file in an editor you can find exactly where each message applies. But there are many switches that can be used to modify the output. To see better where each message is issued, without generating too much output, you can use -gnatv:

     3.    if A > B then
           |
        >>> warning: "return" statement missing following this statement
        >>> warning: Program_Error may be raised at run time
     4.       return -1;
                     |
        >>> warning: value not in range of type "Standard.Natural"
        >>> warning: "Constraint_Error" will be raised at run time
     6.       return 5 / 0;
                       |
        >>> division by zero
        >>> static expression fails Constraint_Check

And if you use -gnatl, you can get a full listing with line numbers and all the messages:

Compiling: f.adb (source file time stamp: 2013-12-28 18:26:22)

     1. function F (A, B : Natural) return Natural is
     2. begin
     3.    if A > B then
           |
        >>> warning: "return" statement missing following this statement
        >>> warning: Program_Error may be raised at run time
     4.       return -1;
                     |
        >>> warning: value not in range of type "Standard.Natural"
        >>> warning: "Constraint_Error" will be raised at run time
     5.    elsif B = 0 then
     6.       return 5 / 0;
                       |
        >>> division by zero
        >>> static expression fails Constraint_Check
     7.    end if;
     8. end F;

Note that in the above output, the source-file time stamp may be annoying if, for example, you are filing regression test output, but it can be suppressed using the switch -gnatd7. Also -gnatl takes an optional parameter (e.g., -gnatl=f.lst) that allows this output to be written to a designated file.

In the above output, we have messages that extend over two lines. The switch -gnatjnn, where nn is a decimal integer, provides a nice way of outputting such messages. The nn value is the maximum line length, so, for example, if we would like to limit the output message length to 68 characters, we can use the switches -gnatl and -gnatj68:

     1. function F (A, B : Natural) return Natural is
     2. begin
     3.    if A > B then
           |
        >>> warning: "return" statement missing following this
            statement, Program_Error may be raised at run time
     4.       return -1;
                     |
        >>> warning: value not in range of type "Standard.Natural",
            "Constraint_Error" will be raised at run time
     5.    elsif B = 0 then
     6.       return 5 / 0;
                       |
        >>> division by zero, static expression fails
            Constraint_Check
     7.    end if;
     8. end F;

The -gnatj switch is a pretty recent addition, and many people are not aware of it, but it is definitely nice in many situations.

In addition to basic source output control, there are various auxiliary outputs that are useful. Of particular interest is -gnatR, causing the compiler to print representation information, including sizes and alignments, which can be very useful for diagnosing problems in interfacing to external systems and hardware.
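
As a small illustration of where -gnatR helps, here is a sketch of a package that could be compiled with that switch (for example, gcc -c -gnatR device_regs.ads). The package and its types are hypothetical and not part of this Gem: one record carries an explicit representation clause, as might be needed to match a hardware register, and the other is left to the compiler's default layout so that the two reports can be compared.

```ada
package Device_Regs is

   --  Layout pinned down explicitly: one byte, Ready in bit 0,
   --  Code in bits 1 .. 7.
   type Status is record
      Ready : Boolean;
      Code  : Natural range 0 .. 127;
   end record;

   for Status use record
      Ready at 0 range 0 .. 0;
      Code  at 0 range 1 .. 7;
   end record;

   for Status'Size use 8;

   --  Same components, but with the layout chosen by the compiler.
   type Plain_Status is record
      Ready : Boolean;
      Code  : Natural range 0 .. 127;
   end record;

end Device_Regs;
```

The listing printed by -gnatR for the two types shows whether the size and component positions the compiler actually used match what the external interface expects.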

Robert Dewar
AdaCore

Dr. Robert Dewar is co-founder, President and CEO of AdaCore and Emeritus Professor of Computer Science at New York University. With a focus on programming language design and implementation, Dr. Dewar has been a major contributor to Ada throughout its evolution and is a principal architect of AdaCore’s GNAT Ada technology. He has co-authored compilers for SPITBOL (SNOBOL), Realia COBOL for the PC (now marketed by Computer Associates), and Alsys Ada, and has also written several real-time operating systems for Honeywell Inc. Dr. Dewar has delivered papers and presentations on programming language issues and safety certification and, as an expert on computers and the law, he is frequently invited to conferences to speak on Open Source software, licensing issues, and related topics.

Gem #155: Enhancing the GPRBuild Database for a New Language

Let's get started...

In Gem #152, we described how to define a new language inside a project file. That's useful when you use the language only rarely and have just one project file for it. But when you use the language often and have several projects that rely on it, this approach becomes rather cumbersome.

Fortunately, there's a way to indicate to gprbuild all the characteristics of a language without having to repeat them in every project that uses the language. In fact, that's the technique used to support Ada, C, C++, and Fortran natively in gprbuild.

gprconfig XML database

As we discussed in the earlier Gem, gprbuild reads the definition of the language and its support tools from the project file. In fact, it also uses a second (most often implicit) file, called the configuration project file, or simply the configuration file.

The configuration file uses the same syntax as the project files themselves, and is merged with all of the project files that are parsed by gprbuild. The goal here is thus to alter the contents of the configuration file, so as to add the information for the new language there, instead of in every project in the project tree.

When gprbuild is invoked without a specified configuration project file (indicated by switches --config= or --autoconf=), and there is no default configuration project file (default.cgpr), gprbuild invokes gprconfig to create a configuration project file, and then uses the newly created file. This is called auto-configuration.

To build this configuration project file, gprconfig uses an XML database. By default the XML files used are located in directory <prefix>/share/gprconfig, where you will find files such as compilers.xml, c.xml, and gnat.xml.

It's also possible to indicate that gprconfig should take into account other XML files in another directory. This is done through the --db switch:

--db dir  (Parse dir as an additional knowledge base)

The format of these XML files is described in the GPRbuild User's Guide, in section 2.2.6, "The GPRconfig knowledge base".

Creating the XML file

So, to describe the characteristics of language "New_Lang", we will create an XML file new_lang.xml in a directory "db".

An XML file in a gprconfig database must start with:

        <?xml version="1.0" ?>
        <gprconfig>

and end with:

        </gprconfig>

For each compilable language, there must be a <compiler_description> tag and a <configuration> tag.

Naming the compiler

The first part of work done by gprconfig is to determine which compilers are installed. To do this, it uses the information from the <compiler_description> nodes in any of the XML files. These nodes need to explain how to locate the compiler executable anywhere on the PATH, then how to extract information from it such as its version and the list of supported languages.

The <compiler_description> tag includes several child tags. Some are compulsory: <name>, <executable>, <version>, and <languages>.

In our XML file new_lang.xml, the compiler_description tag will be:

        <compiler_description>
          <name>NEW_LANG</name>
          <executable>nlang</executable>
          <version>1.0</version>
          <languages>New_Lang</languages>
        </compiler_description>

indicating that it is describing the NEW_LANG compiler for language "New_Lang", that the compiler driver is "nlang", and that it has version 1.0.

In general, the version number is not hard-coded like it is here, but is the result of running the executable with a special switch, and then parsing the output by using regular expressions. See the file compilers.xml in gprbuild distribution for some examples.

Describing the characteristics of the language

After this first pass, gprconfig has a full list of all the compilers available on the system. Based on the needs of the projects (in particular which programming languages are used and the target), it will try to find a set of compatible compilers, and use those to generate the configuration file.

Full control is given over the attributes that need to be added to the configuration file, using the <configuration> node in the XML files.

The <configuration> tag needs two child tags: <compilers> and <config>.

The <compilers> tag indicates the different languages/compilers/versions this configuration applies to. Here, we only need to indicate that the name is "NEW_LANG".

        <compilers>
          <compiler name="NEW_LANG" />
        </compilers>

The <config> tag indicates the chunks that need to be included in the configuration project file. Here we only need to have packages Naming and Compiler.

If there are several "chunks" in different <configuration> nodes for the same package, gprconfig automatically merges these chunks in the package in the generated configuration file.

So, our XML file new_lang.xml will contain:

        <?xml version="1.0" ?>
        <gprconfig>
          <compiler_description>
            <name>NEW_LANG</name>
            <executable>nlang</executable>
            <version>1.0</version>
            <languages>New_Lang</languages>
          </compiler_description>
          <configuration>
            <compilers>
              <compiler name="NEW_LANG" />
            </compilers>
            <config>
        package Naming is
           for Body_Suffix ("New_Lang") use ".nlng";
        end Naming;
        package Compiler is
           for Driver ("New_Lang") use "nlang";
           for Leading_Required_Switches ("New_Lang") use ("-c");
           for Dependency_Kind ("New_Lang") use "Makefile";
           for Dependency_Switches ("New_Lang") use ("--dependencies=");
           for Include_Switches ("New_Lang") use ("-I", "");
        end Compiler;
            </config>
          </configuration>
        </gprconfig>

Using gprbuild auto-configuration for the new language

We need to make sure that the compiler "nlang" is on the path.

If we have a project file prj.gpr in the current working directory that contains:

	project Prj is
           for Languages use ("New_Lang");
        end Prj;

and a New_Lang source foo.nlng, then invoking:

	gprbuild prj.gpr --db db

will invoke the compiler for foo.nlng:

	$ gprbuild prj.gpr --db db
	nlang -c foo.nlng
	$

We now have a way to compile sources of our language New_Lang without the need to repeat all the characteristics of the language in each of the project files with New_Lang sources. However, we still need to invoke gprbuild with the switch --db. Could we do better?

Incorporating XML files in the default gprconfig database

Yes, we can! If we simply copy our XML file new_lang.xml into the default gprconfig XML database <prefix>/share/gprconfig, the language New_Lang will be taken into account automatically by gprconfig and we will no longer need to invoke gprbuild (or gprconfig) with the switch --db:

	$ gprbuild prj.gpr
        nlang -c foo.nlng
        $

Summary

We have described a way to have a new language auto-configured in gprbuild, through an XML file.

Of course, this example is very simple. The full documentation of the gprconfig XML files is in the GPRbuild User's Guide, in section 2.2.6, "The GPRconfig knowledge base", and its subsections.

So, if you are interested in describing your new languages through XML files, we encourage you to read it, and to study the different XML files in <prefix>/share/gprconfig.

 

Vincent Celier
AdaCore

Vincent Celier spent twenty years in the French Navy, as a radar and computer officer. He retired in 1988 with the rank of commander and joined CR2A, a software house, where he was one of the authors of the ISO Technical Report ExtrA (Extensions Temps Reel en Ada). In 1994 he emigrated to Vancouver, Canada, to work on CAATS, the Canadian Automated Air Traffic System, a large system written in Ada. He joined AdaCore in 2000. He is the main implementer of the Project Manager and of gprbuild, the multi-language builder.

Gem #154: Multicore Maze Solving, Part 2

Let’s get started...

 

This series of Gems describes the concurrent maze solver project ("amazing") included with the GNAT Pro examples. The first Gem in the series introduced the project itself and explained the concurrent programming design approach. This second Gem explores the principal change that was required for optimal performance on multicore architectures. This change solved a critical performance bottleneck that was not present when the original program was first deployed in the 1980s, illustrating one of the fundamental differences between traditional multiprocessing and modern multicore programming.

The original target machine was a Sequent Balance 8000, a symmetric multiprocessor with eight CPUs and shared memory. The operating system transparently dispatched Ada tasks to processors, so one could write a highly portable concurrent Ada program for it. In the 1980s this was a very attractive machine, as you might imagine. The resulting program successfully demonstrated Ada's capability to harness such architectures, as well as the general benefits of parallel execution. In particular, the execution time for the sequential version of the maze solver grew at an alarming rate as the number of maze solutions grew larger, whereas the parallel version showed only modest increases. (Remember, the point is to find all the possible solutions to a given maze, not just one.)

The program was indeed highly portable and ran on a number of very different vendors' machines, some parallel and some not. Over time, we have incorporated the language revisions' advances, primarily protected types, and added features such as command-line switches for flexibility, but the architecture and implementation have largely remained unchanged. Until recently, that is.

As described in the first Gem in this series, the program "floods" the maze with searcher tasks in a classic divide-and-conquer design, each searcher looking for the exit from a given starting point. The very first searcher starts at the maze entrance, of course, but as any searcher task encounters intersections in the maze, it assigns another identical task to each alternative location, keeping one for itself. Thus, a searcher task that finds the exit has discovered only part of a complete solution path through the maze. If the very first searcher happened to find the exit, it would have a complete solution, but all the other searchers have only a part of any given solution path because they did not start at the entrance.

As the searchers traverse the maze they keep track of the maze locations they visit so that those locations can be displayed if the exit is eventually found. But as we have seen, those locations comprise only a partial path through the maze. Therefore, when a successful searcher displays the entire solution it must also know the locations of the solution prior to its own starting point, as well as the locations it traversed itself to reach the exit. To address that requirement, when a searcher is initiated at a given starting location it is also given the current solution as it is known up to that location. The very first searcher is simply given an empty solution, known as a "trail" in the program. Successful searchers display both the part they discovered and the part they were given when started.

Note that these partial solutions are potentially shared, depending on the maze. (Solutions are unique if any constituent maze locations are different, but that does not preclude partial sharing.) Those maze locations closer to the entrance are likely to be heavily shared among a large number of unique solutions. Conceptually, the complete solutions form a tree of location sequences, with prior shared segments appearing earlier in the tree and unique subsegments appearing beneath them. The maze entrance appears once, in the root at the top of the tree, whereas the maze exit appears at the end of every solution.

Imagine, then, how one might want to represent this tree. Given that segments of the solutions (the trails) are likely shared logically, perhaps we can also share them physically. However, with any shared data structure, race conditions are an obvious concern. We therefore want a representation that will minimize the locking required for mutual exclusion. We also want a representation that can contain any number of location pairs per segment because the mazes are randomly generated initially. That is, we don't know how many locations any given solution will contain, much less how many solutions there will be.

An unbounded, dynamically allocated list of maze locations meets these goals nicely. It can directly represent the logical sharing and can handle trails of any length as long as sufficient memory is available. Even better, no mutual exclusion locking is required because we only need to append list segments to prior, existing segments. There is no need to alter the prior segments themselves, so there is no need to lock the tree at all!
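The idea of appending without altering prior segments can be sketched as follows. This is only an illustration, not the code from the "amazing" project: the names Position, Node, Trail, Extend, and Length are assumptions made for the example, and the secondary list of intersections discussed later is omitted. Each node points back toward the entrance, so two trails can share a common prefix while neither ever writes to it.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Shared_Trail_Sketch is

   --  Hypothetical types for illustration; the project's own
   --  declarations may differ.
   type Position is record
      Row, Col : Positive;
   end record;

   type Node;
   type Trail is access Node;

   type Node is record
      Where : Position;
      Prior : Trail;   --  toward the entrance; null for the first location
   end record;

   --  Appending allocates a new node that refers to the existing trail.
   --  Existing nodes are never modified, so no lock is needed here.
   function Extend (T : Trail; P : Position) return Trail is
     (new Node'(Where => P, Prior => T));

   function Length (T : Trail) return Natural is
      Count  : Natural := 0;
      Cursor : Trail   := T;
   begin
      while Cursor /= null loop
         Count  := Count + 1;
         Cursor := Cursor.Prior;   --  pointer chasing, node by node
      end loop;
      return Count;
   end Length;

   --  Two locations shared by both branches below.
   Common : constant Trail := Extend (Extend (null, (1, 1)), (1, 2));

   --  Two searchers continue from the same junction.
   Left   : constant Trail := Extend (Common, (2, 2));
   Right  : constant Trail := Extend (Extend (Common, (1, 3)), (1, 4));

begin
   Put_Line ("Left length:"  & Natural'Image (Length (Left)));
   Put_Line ("Right length:" & Natural'Image (Length (Right)));
   Put_Line ("Prefix shared: "
             & Boolean'Image (Left.Prior = Right.Prior.Prior));
end Shared_Trail_Sketch;
```

Note that every step of Length follows an access value to a separately allocated node, which is the access pattern whose cost is discussed next.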

The representation seems ideal, and for the original symmetric multiprocessor target it was a reasonable approach, but when the program was run on modern multicore machines the performance was very poor. Indeed, individual processor utilization was so poor that the sequential version of the maze solver was quite competitive with the concurrent version.

Poor processor utilization is the key to the problem. Even though we are harnessing multiple processors and can have as many threads available per processor as we may want, individual processors are performing poorly. The problem is caused by poor cache utilization, itself a result of poor locality of reference. Specifically, the dynamically allocated elements within the trails are not in memory locations sufficiently close to one another to be in the same cache line, thereby causing many cache misses and poor overall processor performance.

The issue is that searcher tasks must also examine the locations within their prior solution trails as they search for the exit, not only when displaying solutions. They do so to prevent false circular solutions through the maze, made possible by the presence of intersections. Therefore, the searcher tasks must determine whether they have been to a given location in the maze before including that location in their solution. Not all location pairs in a trail need be visited, however, to perform this check. The presence of an intersection in a prior path segment suffices to indicate circular motion, so each trail includes a list of intersections, and it is this secondary list that the searchers examine. Unfortunately, any benefits of that implementation optimization are overwhelmed by the results of the cache misses.

A different trail representation is needed for programs intended for multicore targets, one with much better locality of reference. Arrays have that precise characteristic, so we have chosen a bounded, array-backed list to represent trails. That choice will not surprise those familiar with this problem, even though the resulting copying and lack of physical sharing would argue against it.
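As a rough sketch of what a bounded, array-backed trail might look like (this is not the project's actual component; the names Trail, Capacity, Append, and Visited are assumptions for illustration), consider:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Bounded_Trail_Sketch is

   --  Hypothetical types for illustration only.
   type Position is record
      Row, Col : Positive;
   end record;

   type Position_Array is array (Positive range <>) of Position;

   --  All elements live in one contiguous block, so a linear scan
   --  touches adjacent memory locations.
   type Trail (Capacity : Positive) is record
      Length : Natural := 0;
      Steps  : Position_Array (1 .. Capacity) := (others => (1, 1));
   end record;

   --  Raises Constraint_Error if the trail is already full.
   procedure Append (T : in out Trail; P : Position) is
   begin
      T.Length := T.Length + 1;
      T.Steps (T.Length) := P;
   end Append;

   --  The check against circular paths becomes an array scan.
   function Visited (T : Trail; P : Position) return Boolean is
     (for some I in 1 .. T.Length => T.Steps (I) = P);

   Parent : Trail (Capacity => 16);
   Child  : Trail (Capacity => 16);

begin
   Append (Parent, (1, 1));
   Append (Parent, (1, 2));

   Child := Parent;          --  a full copy handed to a new searcher
   Append (Child, (2, 2));   --  does not affect Parent

   Put_Line ("Parent length:" & Natural'Image (Parent.Length));
   Put_Line ("Child length:"  & Natural'Image (Child.Length));
   Put_Line ("Child visited (1, 1): "
             & Boolean'Image (Visited (Child, (1, 1))));
end Bounded_Trail_Sketch;
```

The assignment to Child copies every element, which is the cost mentioned above. In exchange, each searcher scans its own contiguous array rather than chasing pointers through separately allocated nodes.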

In the next Gem in this series we will provide the details of this implementation change and the reusable components involved.

As mentioned, the "amazing" project is supplied with the GNAT Pro native compiler. Look for it in the share/examples/gnat/amazing/ directory located under your compiler’s root installation. Note that the described design change will appear in future releases of the compiler.

 

Pat Rogers
AdaCore

Pat Rogers has been a computing professional since 1975, primarily working on microprocessor-based real-time applications in Ada, C, C++ and other languages, including high-fidelity flight simulators and Supervisory Control and Data Acquisition (SCADA) systems controlling hazardous materials. Having first learned Ada in 1980, he was director of the Ada9X Laboratory for the U.S. Air Force’s Joint Advanced Strike Technology Program, Principal Investigator in distributed systems and fault tolerance research projects using Ada for the U.S. Air Force and Army, and Associate Director for Research at the NASA Software Engineering Research Center. He has B.S. and M.S. degrees in computer systems design and computer science from the University of Houston and a Ph.D. in computer science from the University of York, England. As a member of the Senior Technical Staff at AdaCore, he specializes in supporting real-time/embedded systems developers, creates and provides training courses, and is project leader and a developer of the GNATbench Eclipse plug-in for Ada. He also has a 3rd Dan black belt in Tae Kwon Do and is founder of the AdaCore club “The Wicked Uncles”.

Gem #153: Multicore Maze Solving, Part 1

Let’s get started...

This Gem series introduces the "amazing" project included with the GNAT Pro compiler examples.  The project is so named because it involves maze solving (as in "Joe and Julie go a-mazing").  But these aren’t typical mazes that have only one solution.  These mazes can have many solutions, tens of thousands, for example.  The point is to find all of them as quickly as possible. Therefore, we solve the mazes concurrently, applying multiple CPUs in a divide-and-conquer design. In this first Gem we introduce the program and explain the approach.

We actually have two programs for maze solving: one sequential and one concurrent. Based on the notion of a mouse solving a maze, the sequential program is named mouse and the concurrent version is named – you guessed it – mice. Both are invoked from the command line with required and optional switches. The available switches vary somewhat between the two, but in both cases you can either generate and solve a new maze or re-solve a maze previously generated.

When generating a new maze you have the option to make it "perfect", that is, to have only one solution. Otherwise the maze will have an unknown number of solutions. For our purposes we use mazes that are not perfect, and in fact the number of solutions depends solely on the size of the mazes and the random way in which they are generated.

Switches "-h" and "-w" allow you to specify the height and width of a new maze, but other than that its layout is randomly determined. In addition, the concurrent program allows you to specify the total number of tasks available for solving the maze using the "-t" switch. This switch is useful for experimentation, for example in determining the effect of having a great many tasks, or in determining the optimal number of tasks relative to the number of processors available. There are four tasks available by default. The concurrent program will run on as many or as few processors as are available.

Finally, you can control whether a maze and its solutions are displayed. At first glance this might seem a strange option, but displaying them makes the program heavily I/O-bound and serialized, hiding the benefits of parallelism and making it difficult to determine the effects of design changes. Disabling the display is achieved via the "-q" switch.

After either program solves a new maze, you are asked whether you want to keep it. If so, you specify the file name and the program then writes it out as a stream. To re-solve an existing maze you specify both the "-f" switch and the file’s name.

As the programs execute they display the maze, the unique solutions through the maze, and the running total of the number of solutions discovered (unless the "-q" switch is applied). Currently there are two kinds of "console" supported for depicting this information. The selection is determined when building the executables, under the control of a scenario variable having possible values "Win32" and "ANSI". (Terminals supporting ANSI escape sequences are common on Linux systems, so there is a wide range of supported machines.)

Now that you know what the programs can do, let’s see how they do it.

The sequential mouse program illustrates the fundamental approach. As it traverses the maze, it detects junctions in the path where more than one way forward is possible. There may be three ways forward, in fact, but not four because that would involve going back over the location just visited. Hence, at any junction the mouse saves all but one of the other alternative locations, along with the potential solution as it is currently known, and then pursues that one remaining lead. Whenever the mouse can go no further – either because it has encountered a dead end or because it has found the maze exit – it restores one of the previously saved alternative location/solution pairs and proceeds from there. The program is finished when the mouse can go no further and no previous alternatives are stored.
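The save-and-restore approach described above can be sketched on a tiny hard-coded maze. This is an illustrative reconstruction rather than the mouse program itself: the maze layout, the entrance and exit positions, and all of the names (Trail, Lead, Saved, Is_Open, and so on) are assumptions made for the example.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Mouse_Sketch is

   Rows : constant := 5;
   Cols : constant := 5;

   type Grid is array (1 .. Rows) of String (1 .. Cols);

   --  '#' is a wall, ' ' is open: a ring around one central wall.
   The_Maze : constant Grid :=
     ("#####",
      "#   #",
      "# # #",
      "#   #",
      "#####");

   type Position is record
      Row, Col : Positive;
   end record;

   Entrance : constant Position := (2, 2);
   Way_Out  : constant Position := (4, 4);   --  "exit" is a reserved word

   Max_Trail : constant := Rows * Cols;
   type Position_List is array (1 .. Max_Trail) of Position;

   type Trail is record
      Steps  : Position_List := (others => (1, 1));
      Length : Natural := 0;
   end record;

   --  A saved alternative: where to resume, and the solution so far.
   type Lead is record
      Where : Position;
      Path  : Trail;
   end record;

   Max_Leads : constant := 2 * Max_Trail;
   Saved     : array (1 .. Max_Leads) of Lead;
   Top       : Natural := 0;

   type Offsets is array (1 .. 4) of Integer;
   D_Row : constant Offsets := (-1, 1, 0, 0);
   D_Col : constant Offsets := (0, 0, -1, 1);

   function Is_Open (Row, Col : Integer) return Boolean is
     (Row in 1 .. Rows and then Col in 1 .. Cols
      and then The_Maze (Row) (Col) = ' ');

   function Visited (P : Position; T : Trail) return Boolean is
     (for some I in 1 .. T.Length => T.Steps (I) = P);

   Current   : Position := Entrance;
   Path      : Trail;
   Solutions : Natural := 0;

begin
   loop
      Pursue : loop
         Path.Length := Path.Length + 1;
         Path.Steps (Path.Length) := Current;
         exit Pursue when Current = Way_Out;

         declare
            Next  : Position := Current;
            Found : Boolean  := False;
         begin
            for K in Offsets'Range loop
               declare
                  R : constant Integer := Current.Row + D_Row (K);
                  C : constant Integer := Current.Col + D_Col (K);
               begin
                  if Is_Open (R, C) and then not Visited ((R, C), Path) then
                     if not Found then
                        Next  := (R, C);   --  the one lead pursued now
                        Found := True;
                     else
                        Top := Top + 1;    --  save the alternative
                        Saved (Top) := (Where => (R, C), Path => Path);
                     end if;
                  end if;
               end;
            end loop;
            exit Pursue when not Found;    --  dead end
            Current := Next;
         end;
      end loop Pursue;

      if Current = Way_Out then
         Solutions := Solutions + 1;
      end if;

      exit when Top = 0;                   --  nothing left to restore
      Current := Saved (Top).Where;
      Path    := Saved (Top).Path;
      Top     := Top - 1;
   end loop;

   Put_Line ("Solutions found:" & Natural'Image (Solutions));
end Mouse_Sketch;
```

For the ring-shaped maze above there are two routes from the entrance to the exit, one on each side of the central wall, so the sketch should report two solutions.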

The mice program uses the same basic approach, except it does it concurrently. A "searcher" task type implements the sequential mouse behavior, but instead of storing the alternatives at junctions, it assigns a new searcher task to each of the alternatives. These new searchers continue concurrently (or in parallel) with the searcher that assigned them, themselves assigning new searchers at any junctions they encounter. Only when no additional searcher task is available does any given searcher store alternative leads for later pursuit. If it does restore a lead, it uses the same approach at any further junctions encountered.

A new searcher task may be unavailable when requested, because we use a pool of searcher instances, with a capacity controlled by the command-line parameter. When no further progress is possible, a searcher task puts itself back into this pool for later assignment, so additional searchers may be available when restored leads are pursued. The main program waits for all the searchers to be quiescent, waiting in the pool for assignments, before finishing.
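A pool with this behavior might be sketched with a protected object, as below. Only the operation name Return_Member appears in the program text shown in this Gem; Request, Await_Quiescence, the use of integer identifiers instead of task references, and the fixed Capacity are assumptions made to keep the example self-contained.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Pool_Sketch is

   Capacity : constant := 4;   --  would come from the "-t" switch

   subtype Member_Id is Positive range 1 .. Capacity;
   type Idle_Flags is array (Member_Id) of Boolean;

   protected Pool is
      --  Never blocks: when no member is available the caller is
      --  expected to store the lead for later pursuit instead.
      procedure Request (Member : out Member_Id; Available : out Boolean);

      procedure Return_Member (Member : Member_Id);

      --  Opens only when every member is back in the pool.
      entry Await_Quiescence;
   private
      Idle       : Idle_Flags := (others => True);
      Idle_Count : Natural    := Capacity;
   end Pool;

   protected body Pool is

      procedure Request (Member : out Member_Id; Available : out Boolean) is
      begin
         Member    := Member_Id'First;
         Available := False;
         for M in Member_Id loop
            if Idle (M) then
               Idle (M)   := False;
               Idle_Count := Idle_Count - 1;
               Member     := M;
               Available  := True;
               return;
            end if;
         end loop;
      end Request;

      procedure Return_Member (Member : Member_Id) is
      begin
         if not Idle (Member) then
            Idle (Member) := True;
            Idle_Count    := Idle_Count + 1;
         end if;
      end Return_Member;

      entry Await_Quiescence when Idle_Count = Capacity is
      begin
         null;
      end Await_Quiescence;

   end Pool;

   Who       : Member_Id;
   Ok        : Boolean;
   Delegated : Natural := 0;
   Stored    : Natural := 0;

begin
   --  Ask for one member more than the pool holds.
   for Attempt in 1 .. Capacity + 1 loop
      Pool.Request (Who, Ok);
      if Ok then
         Delegated := Delegated + 1;
      else
         Stored := Stored + 1;   --  lead kept for later pursuit
      end if;
   end loop;

   Put_Line ("Delegated:" & Natural'Image (Delegated)
             & ", stored:" & Natural'Image (Stored));

   for M in Member_Id loop
      Pool.Return_Member (M);
   end loop;

   Pool.Await_Quiescence;        --  returns once all members are idle
   Put_Line ("All members idle");
end Pool_Sketch;
```

Making Request a procedure rather than an entry reflects the text above: a searcher that cannot delegate does not wait, it stores the lead and carries on.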

The body for the Searcher task type (declared in package Search_Team) implementing this behavior follows:

 

   task body Searcher is
      Path             : Traversal.Trail;
      The_Maze         : constant Maze.Reference := Maze.Instance;
      Current_Position : Maze.Position;
      Myself           : Volunteer;
      Unsearched       : Search_Leads.Repository;
   begin
      loop
         select
            accept Start (My_ID : Volunteer;
                          Start : Maze.Position;
                          Track : Traversal.Trail)
            do
               Myself := My_ID;
               Current_Position := Start;
               Path := Track;
            end Start;
         or
            terminate;
         end select;

         Searching : loop
            Pursue_Lead (Current_Position, Path, Unsearched);

            if The_Maze.At_Exit (Current_Position) then
               Traversal.Threaded_Display.Show (Path, On => The_Maze);
            end if;

            exit Searching when Unsearched.Empty;

            --  Go back to a position encountered earlier that
            --  could not be delegated at the time.
            Unsearched.Restore (Current_Position, Path);
         end loop Searching;

         Pool.Return_Member (Myself);
      end loop;
   end Searcher; 

The Searcher task first suspends, awaiting either initiation (to start pursuing a lead) or termination. The rendezvous provides the initial location and the currently known solution path. The parameter My_ID is a reference to that same task and is used by the task to return itself to the pool when searching is finished. The accept body simply copies these parameters to the local variables. The other local variables include a reference to the maze itself (we use a singleton for that) and the repository of unsearched leads, used to store position/solution pairs for future pursuit.
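The accept-or-terminate structure can be tried in isolation with a much smaller program. The sketch below is not taken from the project: Worker, Tally, and the arithmetic standing in for the search are all hypothetical, chosen only to show the caller's side of the rendezvous and how the terminate alternative lets idle tasks finish.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Rendezvous_Sketch is

   --  Hypothetical shared result, standing in for the solution display.
   protected Tally is
      procedure Add (N : Natural);
      function Total return Natural;
   private
      Sum : Natural := 0;
   end Tally;

   protected body Tally is
      procedure Add (N : Natural) is
      begin
         Sum := Sum + N;
      end Add;

      function Total return Natural is
      begin
         return Sum;
      end Total;
   end Tally;

   task type Worker is
      entry Start (Work : Natural);
   end Worker;

   task body Worker is
      Mine : Natural := 0;
   begin
      loop
         select
            accept Start (Work : Natural) do
               Mine := Work;   --  copy the parameter; the caller resumes
            end Start;
         or
            terminate;         --  taken when the master has finished and
                               --  no further calls to Start can arrive
         end select;

         Tally.Add (Mine);     --  stands in for the actual search
      end loop;
   end Worker;

begin
   declare
      Team : array (1 .. 3) of Worker;
   begin
      for I in Team'Range loop
         Team (I).Start (Work => I * 10);
      end loop;
   end;   --  the block waits here until every Worker has terminated

   Put_Line ("Total:" & Natural'Image (Tally.Total));
end Rendezvous_Sketch;
```

Because the inner block cannot be left until its dependent tasks have terminated, the total is read only after all three workers have added their share.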

As the task searches for the exit, procedure Pursue_Lead delegates new searcher tasks to alternatives when junctions are encountered. The procedure returns when no further progress can be made on a given lead. In effect we "flood" the maze with searcher tasks, so this is a divide-and-conquer design typical of classical concurrent programming.

In the next Gem in this series, we will describe a fundamental implementation change made very recently (September 2013) to the original concurrent program. This change solved a critical performance bottleneck that was not present when the original program was first deployed in the 1980s, illustrating one of the fundamental differences between traditional multiprocessing and modern multicore programming.

As mentioned, the "amazing" project is supplied with the GNAT Pro native compiler. Look for it in the share/examples/gnat/amazing directory located under your compiler’s root installation. Note that the design change will appear in future releases of the compiler.

Pat Rogers
AdaCore
