Microarchitecture rpm macros

During Hackweek 20 at SUSE I created some rpm macros to make it easy to build packages that use the glibc-hwcaps feature. There’s a post with the journal from the hackweek in case you want to read it. Here I’ll just explain how to use the new macros I created.

The package definition

First you have to add a BuildRequires to use the macros:

BuildRequires:  microarch-rpm-macros

Then before the %description section, you have to add a line like:

%microarch_subpackage -n %{libname}

The %microarch_subpackage macro generates the subpackage sections. It’s important that the parameter passed to it is the same as the parameter passed to the %package section that defines the library package. It also generates a %files section for each subpackage with the same contents as the library package’s %files section, but with the directory adapted to the microarchitecture of each subpackage.

The %build section

Let’s consider the following code in the build section:

autoreconf -fiv
%configure \
   --with-pic \
   --disable-static
make %{?_smp_mflags} CFLAGS="%{optflags}"

We will replace that with:

autoreconf -fiv
%{microarch_build %$configure \
  --with-pic \
  --disable-static
  make %{?_smp_mflags} CFLAGS="%{$optflags}"
}

The %microarch_build macro takes care of executing the code within it four times, building the baseline version of the package and then the x86-64-v2, x86-64-v3 and x86-64-v4 versions, each in a different directory and with different %optflags values that include the respective -march and -mtune parameters, as well as different %_libdir values so that each library is installed to the right place later in %install.

Note that %configure was replaced with %$configure and %{optflags} was replaced with %{$optflags}. This is done so that they’re not expanded before the arguments are passed to %microarch_build.

Also note that the autoreconf execution was left out of the macro. This is so that the configure script is generated in the root source directory. The %microarch_build macro can then create build.x86-64-vN directories and put the build files there.

In my test with the bzip2 package I had a special case: if the %do_profiling boolean is set, the code is built, the tests are executed and then the code is built again using the generated profiling information. The %build section used was:

%configure \
  --with-pic \
  --disable-static
%if 0%{?do_profiling}
  make %{?_smp_mflags} CFLAGS="%{optflags} %{cflags_profile_generate}"
  make %{?_smp_mflags} CFLAGS="%{optflags} %{cflags_profile_generate}" test
  make %{?_smp_mflags} clean
  make %{?_smp_mflags} CFLAGS="%{optflags} %{cflags_profile_feedback}"
%else
  make %{?_smp_mflags} CFLAGS="%{optflags}"
%endif

And I replaced that with:

%{microarch_build %$configure \
  --with-pic \
  --disable-static
%if 0%{?do_profiling} && "%{$microarch_current_flavor}" == "x86-64"
  make %{?_smp_mflags} CFLAGS="%{$optflags} %{cflags_profile_generate}"
  make %{?_smp_mflags} CFLAGS="%{$optflags} %{cflags_profile_generate}" test
  make %{?_smp_mflags} clean
  make %{?_smp_mflags} CFLAGS="%{$optflags} %{cflags_profile_feedback}"
%else
  make %{?_smp_mflags} CFLAGS="%{$optflags}"
%endif
}

Note that here again I used a $ within %{$microarch_current_flavor} so it can be replaced in each flavor with the right value.

The %install section

%install sections usually consist of running something like %make_install and then maybe installing some files manually. In this case, we would replace this:

%make_install pkgconfigdir=%{_libdir}/pkgconfig
install -Dpm 0755 foo %{buildroot}%{_bindir}/foo
install -m 0644 %{SOURCE2} %{buildroot}%{_mandir}/man1

with:

%microarch_install %make_install pkgconfigdir=%{_libdir}/pkgconfig
install -Dpm 0755 foo %{buildroot}%{_bindir}/foo
install -m 0644 %{SOURCE2} %{buildroot}%{_mandir}/man1

%microarch_install runs the argument passed to it 4 times, but the microarch flavors are installed first and the baseline flavor last.

Note that after the flavors are installed and before the baseline installation is done, it’ll remove all *.so files within the glibc-hwcaps directories since we don’t want development files in there.

The %check section

In the %check section packages usually run the generated binaries to test they work as expected. Note that we can’t do that for all flavors since we may not have a recent enough CPU to run them. Because of this, I opted to just check the baseline flavor.

Just replace

make %{?_smp_mflags} test

or anything you have to run from the build directory in %check with:

pushd %microarch_baseline_builddir
make %{?_smp_mflags} test
popd

And that should be enough for the simple case of bzip2 and similar packages.

Please note that this is a work in progress: using the macros with %cmake or %meson is not supported yet and will currently fail. Check the conclusions in the previous post for information about what’s still missing.

Hackweek 20: glibc-hwcaps in openSUSE

This week we held Hackweek 20 at SUSE, so I’ll try to explain here what I worked on. I recently noticed that glibc 2.33 introduced hwcaps support, which means it’s now possible to install libraries built for the expanded instruction sets of recent CPUs in addition to the regularly compiled libraries, and glibc will automatically choose the version optimized for the CPU in use. This sounded very nice, so I thought I’d work on that for my hackweek project.

My plan was to work at the package-building level: add/modify rpm macros to make it easy to build packages so that subpackages optimized for the different microarchitectures are (semi-)automatically generated, and SUSE/openSUSE users can easily install those packages with optimizations for their specific CPU.

The preliminary tests

I began by creating a home:alarrosa:branches:hackweek:glibc-hwcaps project in OBS to force gcc-11 to be used by default to build every package I wanted to test. I then added a home:alarrosa:branches:hackweek:glibc-hwcaps:baseline subproject where I’d build baseline versions of packages, and a home:alarrosa:branches:hackweek:glibc-hwcaps:x86-64-v3 project where I’d build the packages using `-march=x86-64-v3 -mtune=skylake`, so they’re optimized for my CPU and I can measure the speed improvement.

I first thought I’d benchmark transcoding a video from H.264 to H.265 using ffmpeg, so I built fdk-aac, libx264, x265 and ffmpeg-4 in both projects (baseline and x86-64-v3). The results were practically the same with both versions, but that was partly expected, since ffmpeg and most video libraries already contain assembly code paths that check the current CPU and run code specifically optimized for it.

So I thought I should try a C/C++ library that’s not video-related, which brought me to building baseline and x86-64-v3 versions of libpng16, poppler, cairo and freetype2 libraries.

I then executed the following command to render png files for each page of a large pdf file using both sets of libraries:

time pdftocairo asio.pdf -png

The results were:

  • 325.618 seconds (mean over 3 runs with 1.235 seconds of difference between the min and max results) for the baseline version.
  • 336.672 seconds (mean over 4 runs with 0.664 seconds of difference between the min and max results) for the x86-64-v3 version

Yes, you read that right. Unexpectedly, the optimized version was noticeably slower. I got a bit frustrated with that result, but I still thought it might be related to problems with the current version of the compiler that could be fixed in the future, so it seemed worth continuing to work on the project.

A quick test for glibc-hwcaps

I created a really small libbar dynamic library with a function that prints a message on the screen, built it three times with three different messages and put each build into /usr/lib64, /usr/lib64/glibc-hwcaps/x86-64-v2 and /usr/lib64/glibc-hwcaps/x86-64-v3. I then wrote a small foo binary that linked to libbar and called that function. With each combination of installed library variants, foo picked up the expected one, so I confirmed that glibc-hwcaps support works as advertised.
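
For reference, this is roughly how such a check can be reproduced. The function name, the MESSAGE macro and the exact build commands below are illustrative, not the ones I actually used:

// bar.cpp: a one-function library. Build it three times, each with a
// different MESSAGE and -march, and install the results to /usr/lib64,
// /usr/lib64/glibc-hwcaps/x86-64-v2 and /usr/lib64/glibc-hwcaps/x86-64-v3:
//   g++ -shared -fPIC -DMESSAGE='"baseline"' bar.cpp -o libbar.so
//   g++ -shared -fPIC -DMESSAGE='"x86-64-v2"' -march=x86-64-v2 bar.cpp -o libbar.so
//   g++ -shared -fPIC -DMESSAGE='"x86-64-v3"' -march=x86-64-v3 bar.cpp -o libbar.so
#include <cstdio>
extern "C" void bar_hello() { std::puts(MESSAGE); }

// foo.cpp: linked against libbar (g++ foo.cpp -L. -lbar -o foo). At run time
// the dynamic loader picks the libbar.so from the best matching glibc-hwcaps
// directory for the running CPU, so the printed message tells you which
// variant was loaded.
extern "C" void bar_hello();
int main() { bar_hello(); }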

The microarch rpm macros

At this point (it was already Wednesday afternoon), I could start working on the rpm macros. In order to test them, I created yet another project at home:alarrosa:branches:hackweek:glibc-hwcaps:test . In there I created a new package, microarch-rpm-macros, that would install… well… the rpm macros 🙂 and then another package called microarch that is used, on one hand, to generate a microarch-filesystem package that owns the new directories /usr/lib64/glibc-hwcaps and /usr/lib64/glibc-hwcaps/x86-64-v[234], and on the other hand, 3 other packages (microarch-x86-64-v2, microarch-x86-64-v3 and microarch-x86-64-v4) whose purpose you’ll see in a moment.

I worked on the rpm macros and these packages on Thursday and Friday and by 19:00 on Friday I got everything working.

I’ll explain the rpm macros I created in my next post, so it can be used as a reference without all the explanations in this post about how they were developed.

The rpm macros build the package four times with different optimization flags, generate the three subpackages with the optimized versions and put the library files in place. Then, after adding the repository from the test OBS project, I could do:

sudo zypper in microarch-x86-64-v3
Loading repository data…
Reading installed packages…
Resolving package dependencies…

The following 2 NEW packages are going to be installed:
 libbz2-1-x86-64-v3 microarch-x86-64-v3

2 new packages to install.
Overall download size: 52.6 KiB. Already cached: 0 B. After the operation, additional 74.4 KiB will be used.
Continue? [y/n/v/…? shows all options] (y):

So just installing the microarch-x86-64-v3 package pulls in all optimized packages for that microarchitecture automatically.

Conclusions

I consider the hackweek project a partial success. I did what I wanted in the original plan and it works well. Of course, there’s still work to do:

  • The rpm macros need to be polished (a lot) before submitting them to Factory.
  • More packages apart from bzip2 should be adapted to use them.
  • The macros will need to be adapted to more use cases. For example, using cmake or meson to build a package with the %microarch* macros is not tested and I have no doubts it’ll fail. Fortunately, now that the main work is done I think this will be easy to implement.
  • I need to provide NOOP versions of the macros for other architectures, since currently they just fail to build packages on anything other than x86-64 (does glibc-hwcaps support microarchitecture levels on other architectures?)

And then, even if I work on all the points above, there’s still the main issue of the optimized libraries being slower than the baseline ones. Once this issue is solved, all this should bring some benefits to our distributions. The project was also useful to confirm that using optimization flags doesn’t always mean the generated code will be faster.

Before ending I’d like to thank Florian Weimer, Michael Matz, Dario Faggioli, Albert Astals and Dan Čermák for their valuable input on this project, as well as Matěj Cepl, Ben Greiner and the rest of the authors of the great openSUSE python singlespec macros, which were the inspiration for this project.

The Browser Tutorial

Last week I was invited to give a talk at the first edition of the “Jornadas Tecnológicas Insert Coín” in Coín (Málaga, Spain) to the students of the high school I.E.S. Los Montecillos and other high schools of the area. The subject of the talk was cross-platform development with KDE and Qt. I think the talk was well received and the students seemed to like it (at least, I think they did). I organized the talk in two sections: first some slides about Qt and KDE (mostly about KDE Frameworks), and then some “live” development to show how to use Qt Creator to develop a small web browser with Qt in C++, and then how to use KDevelop to develop the same application in Python.

After the talk ended, I told the audience I would put the source code on github so they could examine it with more time, modify it and use it to learn more about Qt. So I have just uploaded it to https://github.com/antlarr/thebrowsertutorial with the hope that it’s useful to them, and also to others wanting to learn about Qt. In any case, please read the README in that repository before cloning the sources on your computer.

Update: I uploaded the talk slides I used (in Spanish) in case they’re useful to anyone: insertcoin.odp

Optimizing a Python application with C++ code

I’ve been working lately on a command line application called Bard, which is a music manager for your local music collection. Bard does acoustic fingerprinting of your songs (using acoustid) and stores all song metadata in a sqlite database. With this, you can do queries and find song duplicates easily even if the songs are not correctly tagged. I’ll talk more about Bard and its features in another post, but here I wanted to talk about the algorithm used to find song duplicates and how I optimized it to run around 8000 times faster.

The algorithm

To find out if two songs are similar, you have to compare their acoustic fingerprints. That seems easy (and in fact, it is), but it’s not as straightforward as it looks. A fingerprint (as acoustid calculates it) is not just a number, but an array of numbers, or better said, an array of bits, so you can’t just compare the numbers themselves; you have to compare the bits in those numbers. If all bits are exactly the same, the songs are considered the same; if 99% of the bits are equal, there’s a 99% chance it’s the same tune, maybe differing because of encoding issues (like one song being encoded as mp3 at 192 kbit/s and the other at 128 kbit/s).

But there are more things to have in mind when comparing songs. Sometimes they have different silence lengths at the beginning or end, so the bits that we compare are not correctly aligned and they don’t match as they are calculated, but could match if we shifted one of the fingerprints a bit.

This means that to compare two songs, we don’t just compare the two fingerprints once: we have to simulate increasing/decreasing silence lengths at the beginning of a song by shifting its fingerprint, and calculate the match level for each shift to see if the similarity percentage improves or worsens. Right now, Bard shifts the array by 100 positions in one direction and then another 100 positions in the other direction, which means that for each pair of songs we have to compare the two fingerprints 200 times.

Then, if we want to compare all of the songs in a collection to find duplicates, we need to compare the songs with IDs 1 and 2, then compare the song with ID 3 with IDs 1 and 2, and in general, each song with all previous ones. For a collection with 1000 songs that means 999*1000/2 = 499500 pairs of songs (or nearly 100 million fingerprint comparisons, counting the 200 shifts per pair).

The initial python implementation

Bard is written in python, so my first implementation used a python list to store the array of integers from the fingerprint. For each iteration in which I have to shift the array numbers, I prepend a 0 to one fingerprint array and then iterate over both arrays comparing each element. To do this comparison, I xor the two elements and then use a well-known standard algorithm to count the bits that are set in an integer:

def count_bits_set(i):
    # Standard SWAR popcount for a 32-bit integer
    i = i - ((i >> 1) & 0x55555555)
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333)
    return (((i + (i >> 4) & 0xF0F0F0F) * 0x1010101) & 0xffffffff) >> 24

Let’s use the speed of this implementation as a reference and call it 1x.

First improvement

As a first improvement I tried replacing that bit counting algorithm with gmpy.popcount, which was faster, and I also improved the algorithm by introducing a canceling threshold. The new algorithm stops comparing two fingerprints as soon as it’s mathematically impossible to reach a match above the canceling threshold. So, for example, if while iterating over the fingerprints we calculate that even if all the remaining bits matched we still wouldn’t get at least a 55% match between the songs, we just return “different songs” (but we still need to shift the songs and try again, just in case).

With these improvements (commit) the comparisons ran at nearly an exact 2x speed.

Enter C++17

At that time, I thought that this code wouldn’t scale nicely to a large music collection, so I thought Bard needed a much better implementation. Modifying memory is slow and C/C++ allows much finer-grained low-level optimizations, but I didn’t want to rewrite the application in C++, so I used Boost.Python to implement just this algorithm in C++ and call it from the python application. I should say that I found it really easy to integrate C++ methods with Python and I absolutely recommend Boost.Python.

In the new C++ implementation (commit) I used an STL vector to store the fingerprints and added the maximum offsets in advance, so I don’t need to modify the vector elements during the algorithm; instead, I access the elements through simulated shift offsets. I also use an STL map to store all fingerprints indexed by song ID. Finally, I added another important optimization: using the CPU’s instructions to count set bits via gcc’s __builtin_popcount.
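
As a rough illustration, this is more or less what the core of the comparison looks like (a simplified sketch, not Bard’s actual code): the shift is simulated with an index offset instead of modifying the vectors, set bits are counted with the compiler builtin, and the canceling threshold from the previous section lets it give up early:

#include <algorithm>
#include <vector>

// Fraction of equal bits when fpB is shifted `offset` positions against fpA;
// returns -1.0 as soon as the threshold can no longer be reached.
double similarity(const std::vector<int> &fpA, const std::vector<int> &fpB,
                  int offset, double threshold = 0.55)
{
    const int sizeA = static_cast<int>(fpA.size());
    const int sizeB = static_cast<int>(fpB.size());
    const int start = std::max(0, -offset);
    const int end = std::min(sizeA, sizeB - offset);
    if (end <= start)
        return -1.0;

    const int totalBits = 32 * (end - start);
    const int maxWrongBits = static_cast<int>(totalBits * (1.0 - threshold));
    int wrongBits = 0;
    for (int i = start; i < end; ++i) {
        wrongBits += __builtin_popcount(fpA[i] ^ fpB[i + offset]);
        if (wrongBits > maxWrongBits)   // canceling threshold: give up early
            return -1.0;
    }
    return static_cast<double>(totalBits - wrongBits) / totalBits;
}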

The best part of this approach is that during the comparison itself no fingerprint is copied or modified, which translated into a speed of 126.47x. At this point I started tracking another measure: songs compared per second (remember we’re doing 200 fingerprint comparisons for each pair of songs). This algorithm gave an average speed of 580 songs/second. Put another way, comparing our example collection of 1000 songs would take 14 min 22 sec (and we can calculate that the original Python implementation would take approximately 1 day, 6 hours, 16 minutes and 57 seconds).

First try to parallelize the algorithm

I use an i7 CPU to run Bard and I always thought it was a pity that it only used one core. Since the algorithm that compares two songs no longer modifies their data, I thought it might be interesting to parallelize it so it runs on all 8 cores at the same time and just coalesce the results of the independent iterations. So I wondered how to do it and noticed that when I compare each song with all previous ones, this is done using a for loop that iterates over a std::map containing all songs that are already processed. Wouldn’t it be nice to have a for-each loop implementation that runs each iteration on a different thread? Well, there is! std::for_each in C++17 allows you to specify an ExecutionPolicy with which you can tell it to run the iterations on different threads. Now the bad news: this part of the standard is not completely supported by gcc yet.
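
For reference, this is roughly what that C++17 call looks like on a toolchain that supports it (the container and the lambda here are just placeholders; on recent gcc versions you would also need to link against Intel TBB with -ltbb):

#include <algorithm>
#include <cstdio>
#include <execution>
#include <vector>

int main()
{
    std::vector<int> songIds{1, 2, 3, 4};
    // std::execution::par asks the library to run the iterations in parallel.
    std::for_each(std::execution::par, songIds.begin(), songIds.end(),
                  [](int id) { std::printf("comparing song %d\n", id); });
}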

So I searched for a for_each implementation and found this stackoverflow question, which included one. The problem is that the question mentions the implementation was copied from the C++ Concurrency in Action book and I’m not sure about the license of that code, so I can’t just copy it into Bard. But I could run some tests with it just for measurement purposes.

This increased the speed to 1897x or ~8700 songs/second (1000 songs would be processed in 57 seconds). Impressive, isn’t it? Well… keep reading 🙂

Second parallelization try

So I needed to find a parallelized for_each version I could use. Fortunately, I kept looking and found out that gcc includes an experimental parallel implementation of some algorithms of the C++ standard library, which includes __gnu_parallel::for_each (there are more parallelized algorithms documented on that page). You just need to link against the OpenMP library.

So I rushed to change the code to use it and hit a problem: I used __gnu_parallel::for_each, but every time I tested it, it only ran sequentially! It took me a while to find out what was happening, but after reading the gcc implementation of __gnu_parallel::for_each I noticed it requires a random-access iterator, while I was iterating over a std::map, and maps have a bidirectional iterator, not a random-access one.

So I changed the code (commit) to first copy the fingerprints from the std::map<int, std::vector<int>> to a std::vector<std::pair<int,std::vector<int>>> and with just that, the same __gnu_parallel::for_each line ran using a pool of 8 threads.
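
In sketch form (again placeholder code rather than Bard’s), the change boils down to something like this, compiled with -fopenmp so libstdc++’s parallel mode can spread the iterations over an OpenMP thread pool:

#include <cstdio>
#include <map>
#include <parallel/algorithm>
#include <utility>
#include <vector>

int main()
{
    std::map<int, std::vector<int>> fingerprints{{1, {0x1234}}, {2, {0x5678}}};

    // __gnu_parallel::for_each needs random-access iterators, so copy the
    // map into a vector of pairs first.
    std::vector<std::pair<int, std::vector<int>>> songs(fingerprints.begin(),
                                                        fingerprints.end());

    __gnu_parallel::for_each(songs.begin(), songs.end(),
        [](const std::pair<int, std::vector<int>> &song) {
            std::printf("comparing song %d\n", song.first);
        });
}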

The gcc implementation proved to be faster than the implementation in the stackoverflow question, with a speed of 2442x, ~11200 songs/second and 44 seconds.

The obvious but important improvement I forgot about

While watching the compiler build Bard I noticed I wasn’t passing any compiler flags to optimize for speed! So I tried adding -Ofast -march=native -mtune=native -funroll-loops to the compiler flags (commit). Just that. Guess what happened…

The speed rose to a very nice 6552x, ~30050 songs/second and 16 seconds.

The Tumbleweed improvement I got for free

I’m running openSUSE Tumbleweed on the system I use to develop, which, as you probably know, is a (very nice) rolling release distribution. One day, while doing these tests, Tumbleweed updated the default compiler from gcc 7.3 to gcc 8.1. So I thought that deserved new measurements.

Just changing the compiler to the newer version increased the speed to 7714x, 35380 songs/second and 14 seconds.

The final optimization

An obvious improvement I hadn’t done yet was replacing the map with a vector so I don’t have to convert it before each for_each call. Also, vectors allow reserving space in advance, and since I know the final size the vector will have at the end of the whole algorithm, I changed the code to use reserve wisely.
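
Sketched out with placeholder names, the idea is simply to keep the processed songs in a vector of pairs from the start and reserve its final size up front, so it never reallocates while the collection is being processed:

#include <cstddef>
#include <utility>
#include <vector>

using Fingerprint = std::vector<int>;

int main()
{
    const std::size_t totalSongs = 1000;   // known before the algorithm starts
    std::vector<std::pair<int, Fingerprint>> songs;
    songs.reserve(totalSongs);             // one allocation, no copies while growing

    for (std::size_t id = 1; id <= totalSongs; ++id)
        songs.emplace_back(static_cast<int>(id), Fingerprint{});
}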

This commit gave the last speed increase, to 7998x or 36680 songs/second, which would fully process a music collection of 1000 songs in just 13 seconds.

Conclusion

Just some notes I think are important to keep in mind from this experience:

  • Spend some time thinking how to optimize your code. It’s worth it.
  • If you use C++ and can afford to use a modern compiler, use C++17: it allows you to write MUCH better/nicer code more efficiently. Lambdas, structured bindings, constexpr, etc. are really worth the time spent reading about them.
  • Allow the compiler to do stuff for you. It can optimize your code without any effort from your side.
  • Copy/move data as little as possible. It’s slow and can often be avoided just by thinking a bit about data structures before you start developing.
  • Use threads whenever possible.
  • And probably the most important note: Measure everything. You can’t improve what you can’t measure (well, technically you can, but you won’t know for sure).

How to install the Mycroft A.I. on openSUSE Tumbleweed

Haven’t you ever wanted to have an open source artificial intelligence assistant like the ones some companies provide on their phones/desktops or even at your home, but without the privacy concerns? In July last year, during Akademy 2017, I saw a great presentation of Mycroft and the Mycroft plasmoid, done by Aditya Mehra. Mycroft can understand and answer questions the user speaks into the microphone and can also perform actions the user requests. I immediately knew I wanted that in openSUSE. You can watch the talk in the following video to see what I mean:

Unfortunately, I saw Mycroft had a lot of dependencies and an unorthodox install system, so I didn’t do much with it, but then November came and we had a hackweek at SUSE (in a few words, it’s a week SUSE developers can use to work on personal projects). So I started this project to package all Mycroft dependencies along with Mycroft itself and the Mycroft plasmoid, as well as to modify Mycroft to integrate into a regular Linux system. Since then, I’ve been using some of my free time to update the packages, and the result is that Mycroft can now be easily installed on openSUSE Tumbleweed with a couple of commands, following standard procedures.

I’ll give the installation instructions below, but first of all, let me give some clarifications:

  • Mycroft has many dependencies, including over 50 python packages, many of which are not in the Tumbleweed repositories yet (as of 2018-03-06). This is just a matter of time, but in the meantime you’ll have to add a couple of development repositories.
  • If you’re using openSUSE Leap 42.3, then I’m afraid Mycroft can’t be installed there. The good news is that once Leap 15 is released, you’ll want to revisit this blog as there’ll surely be better news by then.
  • The Mycroft developers have been nice enough to accept a patch I sent them to allow it to be installed in standard distribution directories. I think it would be nice if they also used those on their Mycroft platforms, but of course, that’s their call. Also, Mycroft seems to have always been meant to run in virtualenvs, and that’s not a recommended way to package something for a regular Linux distribution, so the packages are patched to make Mycroft use the system python packages (more on this in the Changes section below, which I strongly recommend reading before installing the packages).

Installation instructions

First, you have to add the devel:languages:python repository to zypper, which contains development python packages that haven’t been accepted (yet) into Tumbleweed:

sudo zypper ar -f https://download.opensuse.org/repositories/devel:/languages:/python/openSUSE_Tumbleweed/devel:languages:python.repo

Then, you have to add the repository of the OBS project I use to release usable packages that are for any reason not yet in the official distribution repositories:

sudo zypper ar -f https://download.opensuse.org/repositories/home:/alarrosa:/packages/openSUSE_Tumbleweed/home:alarrosa:packages.repo

Note that both commands above are just one long line each.

Now, you can install the mycroft-core and plasma-mycroft packages, which should pull in all their dependencies:

sudo zypper in mycroft-core plasma-mycroft

It will ask you to trust the keys of the added repositories. On a clean Tumbleweed system the command installs 160 packages, and after it finishes, you can add the Mycroft plasmoid to the plasma desktop.

Once installed, you can use the plasmoid to start the Mycroft services, ask something (in the example below, I said into the microphone “Hey Mycroft, what is 2 + 2 ?”) and stop Mycroft.

But before using Mycroft you have to pair it. The first time you start it, it will give you a code composed of 6 alphanumeric characters. Go to home.mycroft.ai, create a free account and register your “device” by entering the code.

And that’s all! You should now be able to use Mycroft on your system and maybe even install new skills. A skill is a module that adds a certain capability to Mycroft (for example, if you add the plasma-user-control-skill, Mycroft will understand you when you say “Hey Mycroft, lock the screen” and lock the screen as you requested). Skills can be listed/installed/removed using the plasmoid or the msm command line application.

In any case, please note this is still work in progress and some features may not work well. Also, I made some changes to mycroft-core code and plasma-mycroft in order to install it in a Linux system and allow it to work without a python virtual environment, so this might break stuff too. Please, don’t blame the Mycroft developers for issues you find with these packages and if you report any issue, I think it’s better to mention it first in the comments section in this post before submitting a bug report to github.com/MycroftAI and bothering them with problems that might not be their fault.

Changes with respect to the upstream version

What did I change with respect to the code provided by the Mycroft developers? Well, first, I included some upstream patches to make mycroft-core use python3 and installed it like any other python3 application in /usr/lib/python3.6/site-packages/. That way we’re also helping the Mycroft developers with the planned upgrade to python3 by testing early.

I also changed the way Mycroft is started so it feels more natural on a Linux desktop. For this, I created some systemd units based on the ones done for ArchLinux. The idea is that there’s a user systemd target called mycroft.target that, when started, runs 4 systemd user services (mycroft-bus, mycroft-skills, mycroft-audio and mycroft-voice). Of course, it also stops them when the target is stopped. This is all hidden from the user, who can just start/stop Mycroft by turning a switch in the plasmoid.

On a regular Mycroft installation, the configuration file is in /etc/mycroft.conf and the skills are installed to /opt/mycroft/skills, but on a regular system a regular user can’t modify those files/directories, so I moved them to ~/.mycroft/mycroft.conf and ~/.mycroft/skills and changed mycroft to prefer those locations. You can have a look at the Mycroft documentation to see what parameters you can set in your mycroft.conf file.

When installing a skill in a regular Mycroft installation, msm invokes pip to install the required python modules in the virtual environment. Since we’re not using virtual environments, I’m just logging the required python modules to ~/.mycroft/mycroft-python-modules.log . So if you think a skill might be misbehaving or not being properly loaded, you should first check that file to see if there’s a missing python module requirement, which should then be installed in the system using zypper. I plan to make this automatic in the future, but for now, you’ll have to check manually.

You can see all the changes I made by browsing the mycroft-core and plasma-mycroft pages in OBS.

I also made changes to other packages. For example, the duckduckgo2 python module is not prepared to work with python3, so I ported it. The same happens with the aiml python module, which seems to have been abandoned since 2014 and only works with python2. Fortunately, in this case there’s a python-aiml fork which adds support for python3 and other improvements, so I made mycroft use that one instead.

Some example commands

This is a small list of questions and commands you might like to try:

Hey Mycroft …

  • What is 2 + 2?
  • What is 21% of 314?
  • What is the capital of Spain?
  • When was Alan Parsons born?
  • How high is the Eiffel Tower?
  • Search the web for ethernet cables
  • Set an alarm in 5 minutes (after the alarm is triggered, you can say “Hey Mycroft, stop alarm” to stop it)
  • Remind me to watch the oven in 3 minutes (after the reminder is triggered, say “Hey Mycroft, stop reminder” to stop it)
  • Tell me a joke
  • Tell me about the Solar System
  • Play the news (when you want it to stop, just say “Hey Mycroft, stop”)
  • Open Dolphin
  • Close Firefox
  • Decrease volume
  • Show Activities
  • What time is it?
  • What’s the weather like?
  • Will it rain?
  • Type this is a test (it will write “this is a test” on your current window as if you used the keyboard)

Configuration

After you play a bit with it and check that the basic functionality works, you might want to configure Mycroft for your settings. I recommend at least opening the ~/.mycroft/mycroft.conf file and changing the example location settings to your city, your coordinates (look for your city on Wikipedia and click on the coordinates in the box on the right side to see them in decimal notation) and your timezone (the “offset” value is your timezone difference with respect to GMT in milliseconds, so for example GMT+1 would be 3600000, and “dstOffset” is the daylight saving time offset, which AFAIK is generally 1 hour).

When changing the configuration file, be extremely careful and don’t leave any blank lines or introduce any comments, since the json parser is currently very sensitive to syntax errors (fortunately, you’ll see clear errors in the logs if there are any). In any case, be sure to have a backup config file, just in case.

Known problems

  • The first time you start the Mycroft systemd services, they will download the 31 default skills, which can take a while (up to a couple of minutes). So the first time you start Mycroft from the plasmoid, the plasmoid will time out and report a “Connection Error”. Please just wait a couple of minutes, stop Mycroft from the plasmoid and start it again.
  • Sometimes, the plasmoid seems to have trouble remembering its settings, so if you have trouble starting/stopping the services through the UI, go to the “Settings tab” and change “Your Mycroft core installation path” to “Default Path” and then back to “Installed using Mycroft Package” again (just clicking on one and then the other is enough).
  • Some skills don’t support python3 out of the box yet. This is really troubling since some of the failing ones are installed by default. I sent pull requests to fix 5 skills (fallback-wolfram-alpha, skill-reminder, skill-desktop-launcher, skill-youtube-play and mycroft-youtube) but they haven’t been merged yet, so I’ve distributed the patches to support python3 within the packages and added support to msm to apply local patches after a skill is installed. Since msm also updates the skills from their git repositories, I made it reverse-apply the patches before any update and apply them again afterwards. This should allow upstream patches from the skill developers to still be received while taking preference over my patches. I’ve also fixed skill-autogui and fallback-duckduckgo to work with python3, but haven’t submitted the changes to their corresponding upstreams yet.
  • If you find that any other skill is failing, you can check the journal with journalctl --user --since "1 hour ago" and see if the skill is raising any exception. Also, having a look at ~/.mycroft/mycroft-python-modules.log might be a good idea to check whether those python packages are installed in the system (note that the openSUSE python packaging guidelines state that the python3 package for a module must be called python3-<modulename>, so it should be easy to check manually).
  • Mycroft creates files under /tmp without proper permissions. This may be acceptable on a device like the Mycroft Mark 1 or Mark II, but it’s a security concern on a generic Linux system. I hope to get some time in the next days/weeks to work on this.
  • When installing a skill (using the ui or the msm application) sometimes it doesn’t show up as “installed” but it is. Check the contents of your ~/.mycroft/skills directory to see exactly what is installed.

End thoughts

I have many plans for these packages. For example, I’d like to submit upstream all the changes I’ve made, since I think they will be useful for other distributions too and will help get Mycroft working with python3 as soon as possible. As mentioned before, I’d also like to make a pip/zypper integration tool so skill requirements can be installed automatically in the system, and I’d like to add a skill to Mycroft to integrate it with an application I’m developing (more on this in future posts 🙂). If nobody does it first, it would be great to add Spanish support now that it seems support for languages other than English is being added.

Btw, Mycroft developers are adding support for the Mozilla open source voice recognition framework, so you might consider collaborating with The Common Voice project to help it grow and mature.

Before ending, I’d like to thank all the openSUSE python packagers (especially Tomáš Chvátal) for carefully, patiently and quickly reviewing the python package submissions for over 50 new packages required by mycroft-core, and of course, the Mycroft developers and Aditya Mehra (the plasma-mycroft developer), who are doing a great job.