New LibreOffice GSoC project: resolve deadlocks using a robust and efficient implementation of the Ostrich algorithm

This year’s GSoC is coming; and this year, I suggest that we handle one big problem plaguing LibreOffice: deadlocks.

Users know that sometimes the program hangs. Often that is because of deadlocks. It is well known that one of the industry’s most widely used ways to handle this problem is the Ostrich algorithm [1].

This proposal is to audit the LibreOffice core code for possible deadlocks, and to handle all the places found using the most robust and efficient implementation of the Ostrich algorithm. The task includes a study of available implementations; the chosen one should be efficient, robust, and available under a compatible open-source license.

Students that choose this task may assume that I would gladly mentor their work on this.

Happy hacking!

[1] https://en.wikipedia.org/wiki/Ostrich_algorithm

Multiple columns in LibreOffice text boxes

Thanks to SUSE, our valued partner who supported this development, we at Collabora Productivity have implemented support for multi-column layout in LibreOffice’s text boxes.

Up to now, it was only possible to use columns in Writer’s page styles, sections and frames. One could not make text boxes, including those used in Impress, with text distributed across several columns. There are workarounds like using tables, but that was not the same, and it broke the text flow.

The newly introduced simple columns in text boxes (you just set the number of columns and the spacing between them; there is no per-column width or spacing) are supported in OpenDocument format files (ODT/ODS/ODP/ODG). At the same time, we also introduce support for the related feature in PPTX (tdf#118458) and XLSX files, which improves interoperability. Note, however, that multiple columns in text boxes are not supported in Word and its file formats, so columns that you set up with this new feature in text boxes in Writer cannot be exported to DOCX.
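
Since only the number of columns and a single spacing value are set, the resulting layout can be illustrated with a little arithmetic. The sketch below is not LibreOffice code; it assumes that all columns get equal width and that the spacing applies only between neighbouring columns. The function name and the units are hypothetical.

```python
def column_width(box_width, columns, spacing):
    """Width of each column, assuming equal columns and gaps only between them."""
    if columns < 1:
        raise ValueError("at least one column is required")
    # There are (columns - 1) gaps between neighbouring columns.
    available = box_width - spacing * (columns - 1)
    if available <= 0:
        raise ValueError("spacing leaves no room for text")
    return available / columns
```

For example, a 10 cm wide box with three columns and 0.5 cm spacing would leave 3 cm per column under these assumptions.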

The new setting is available in the “Text Attributes” dialog:

This is how it looks in current master:

This feature will be available in the upcoming LibreOffice 7.2, and in the next Collabora Office update. If you want to try it, you may get the latest Collabora Office snapshots on this page.

Soft edge effect on objects in LibreOffice

After recently implementing the glow effect, we at Collabora Productivity have also implemented the soft edge effect for objects in Draw and Impress. And again, that was done thanks to SUSE, who made it possible.

The relevant bug report’s duplicate contains a sample that I use here for illustration. First, take a look at how it was before:

How it was in LibreOffice 6.4

Now let’s see what it looks like after the change:

How it looks in LibreOffice 7.0

And finally here is the reference look of the slide:

Reference

This will be available in the upcoming LibreOffice 7.0.

Glow effect on objects in LibreOffice

Thanks to SUSE, who made this possible, we now have the glow effect on objects in the upcoming LibreOffice 7.0. Tamás Bunth and I, both engineers at Collabora Productivity, implemented it together for shapes and pictures.

Below are some screenshots of a PPTX slide with glow samples collected from the relevant bug report:

How it was in 6.4
How it looks in master towards 7.0
Reference look

What puzzles me is why fontworks’ (bottom right) glow is not shown in the reference, although the effect is present in its properties. Somehow LibreOffice now seems to support glow in fontworks better 😉

Glow on pictures is only implemented in Impress and Draw. Glow on shapes is available in all modules.

Proper console mode for LibreOffice on Windows

LibreOffice has always supported usage of command line switches that allow operations like conversion of documents to different file types, or batch-printing. Using LibreOffice CLI in various scripts is a very common scenario.

But until now, its support for this on Windows was somewhat suboptimal. Because the main executable module – soffice.bin – was a GUI subsystem application, it could neither properly output its messages to the calling console nor return error codes that a script could check via ERRORLEVEL. The hacks used to redirect the output of the GUI application to the calling console were unreliable and didn’t work at all on some supported versions of Windows. Sometimes one could not even see why the entered command line was rejected as invalid.

I have just pushed a commit that changes the situation: LibreOffice now has a proper console mode on Windows. soffice.bin is now built for the console subsystem, which allows using it in the scenarios mentioned above, with stdout and stderr output, as well as the return code, properly sent to the console (or redirected using normal means); in debug builds, the debug output is also visible on the console. For comfortable usage, a new console launcher executable, soffice.com, is introduced in the program/ folder of the LibreOffice installation, alongside the familiar soffice.exe, which is retained for all GUI uses, as before. This makes it possible to keep using command lines like
"c:\Program Files\LibreOffice\program\soffice" --convert-to odt file.doc
from the cmd.exe command-line interpreter without specifying the executable extension, and to have soffice.com launched for proper console operation (subject to the value of the PATHEXT environment variable). The command properly “owns” the console (it does not return to the command prompt) until soffice finishes.
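
For scripted use, the same call can be wrapped so that the output and the return code are captured. The following is only a sketch: the installation path is an assumption, the helper names are hypothetical, and which return codes soffice actually uses is not covered here, so the code just hands the code back to the caller.

```python
import subprocess
from pathlib import PureWindowsPath

# Assumed installation folder; adjust to the actual installation.
PROGRAM_DIR = PureWindowsPath(r"C:\Program Files\LibreOffice\program")

def build_convert_command(source, target_format, program_dir=PROGRAM_DIR):
    """Build the argument list for a conversion call (hypothetical helper)."""
    # soffice.com is the console launcher described above.
    return [str(program_dir / "soffice.com"), "--convert-to", target_format, source]

def convert(source, target_format):
    """Run the conversion and return (return code, stdout, stderr)."""
    result = subprocess.run(
        build_convert_command(source, target_format),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr
```

A calling script could then treat a non-zero return code as a failure and log the captured stderr, much as a batch file would check ERRORLEVEL.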

The change will be available in LibreOffice 6.3, scheduled for Summer 2019 (unless testing reveals a major problem that would require reverting it). I hope this will make using the LibreOffice CLI more comfortable for Windows users, on par with other platforms. If you find any problems with the solution, please report bugs to our bug tracker. Early testing using daily builds is much appreciated!

Auto-hiding empty mail merge fields

Managing empty fields: the status quo

When doing mail merge, it’s often (even usually) desirable that if a database field is empty for a recipient, then the corresponding line be hidden in the generated document. LibreOffice has always allowed doing this using special Hidden paragraph fields, which is very flexible, though not too user-friendly, because such fields are complex to create and maintain. E.g., one needs to remember to move the Hidden paragraph field along with the database field when editing the document, or to change the field’s conditions when renaming database fields or combining fields in a single paragraph.

There are situations when using Hidden paragraph fields is even impossible. Since the condition in the said field depends on a registered database name, it cannot be used when there’s no registered database (which happens when one wants to connect to data sources dynamically, when one is actually performing the merge).

Meet the convenience

Today we have released the new Collabora Office 5.3-49, which includes an improvement that we at Collabora Productivity have implemented: database fields now hide their paragraphs themselves when the field value is empty, so there’s no need to use separate fields for that. This makes it easier to create and manage auto-hiding of empty database values.

With the change, we are also more interoperable with other office suites that behave that way, including Microsoft Office.

This feature is controlled by a new compatibility option, which is enabled by default in all new documents. If one wants to return to old behaviour, however, one can easily do that using Writer’s compatibility options.
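
The behaviour can be modelled in a few lines. This is a simplified sketch and not Writer’s implementation: the `<Name>` placeholder syntax, the function name and the exact hiding rule (a paragraph is dropped when it contains database fields and is blank after substitution) are assumptions made for illustration. The `hide_empty` flag stands in for the compatibility option.

```python
import re

# Hypothetical placeholder syntax for a database field, e.g. <Company>.
FIELD = re.compile(r"<(\w+)>")

def merge_paragraphs(paragraphs, record, hide_empty=True):
    """Substitute field values; optionally drop paragraphs left blank by empty fields."""
    merged = []
    for para in paragraphs:
        has_fields = bool(FIELD.search(para))
        text = FIELD.sub(lambda m: record.get(m.group(1), ""), para)
        # Assumed rule: hide only paragraphs that had fields and are now blank.
        if hide_empty and has_fields and not text.strip():
            continue
        merged.append(text)
    return merged
```

With a record whose Company value is empty, the paragraph holding only `<Company>` disappears from the result, while with `hide_empty=False` it stays as an empty line, which corresponds to the old behaviour.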

As usual, the improvement will also be available in the next major release of LibreOffice, which is due in August.

Even easier configuration of AD integration for LibreOffice

What we had achieved previously

After we at Collabora Productivity had improved the existing LDAP configuration backend to be relatively easily configurable for Windows clients in an Active Directory-based domain environment, we started to prepare a further improvement in this area, whose purpose was to overcome the problems of the LDAP-based backend. These problems are caused by the fact that the LDAP backend needs proper credentials for the server connection to be explicitly configured. This leads to the requirement to have a dedicated restricted service account with a fake password (which would be written in clear text on each configured workstation), whose only purpose is connecting to the LDAP server and retrieving user information. That approach hardly fits Active Directory’s concept, which has single sign-on (SSO) at its heart. Of course, the preferable solution would be a configuration backend that could get user data from AD using the current user’s credentials, without the need for a service account.

Collabora makes the next step

In the past weeks, we have merged a brand new backend plugin (WinUserInfoBe), which uses the improvements in core made while working on the LDAP backend, and which implements that concept. It is, naturally, even easier to configure than the LDAP backend: the only thing required is to set a user data field to be taken using the plugin; neither server connection configuration nor further mapping of LO data fields to LDAP object properties is needed. And of course, we have made the necessary changes to our ADMX template to make this configurable using the convenient GPO editor interface.

Upgrade and benefit

This change is immediately available in the released Collabora Office 5.3-49 for our customers – why not upgrade now and have a play? – and it will also be in the next major release, LibreOffice 6.1, due in August 2018.

Configuring LibreOffice using GPO to take user data from Active Directory

Introducing the problem

User data in LibreOffice (Options→LibreOffice→User Data) is used for a number of purposes; among them are setting authors of documents, comments and modifications (when in Change Tracking mode); and saving cursor position in documents (so that the authors would jump right to the place they worked at last time, while other readers would see the document from the beginning).

When the data is missing, collaborative work may suffer from e.g. anonymous changes or comments appearing in documents; that’s why it’s usually important for organizations to care that LibreOffice instances have this information filled in properly – and of course, the configuration should be centralized.

Pre-existing (awkward) solution

LibreOffice has always had the ability to get user data using LDAP (a protocol to communicate with different directory services). Using that, it was already possible to set up LibreOffice to take user data (name, organization, address, etc.) from Active Directory (which is also accessible using LDAP) by doing some post-deployment steps: creating a custom .xcd file from a template (oo-ad-ldap.xcd.sample) shipped with LibreOffice, and deploying it to the share\registry sub-folder of the LibreOffice installation on client systems. However, this is not a “native” way to centrally configure Windows applications in an Active Directory domain. Among other things, one needs to invent a mechanism to deploy the .xcd (e.g., using a .bat file at startup, or modifying the MSIs used to install LibreOffice), then debug it and maintain it. For example, the LibreOffice install path may change, as happened with LibreOffice 6.0, which now installs into LibreOffice, whereas previously the install path included the major version number. The .bat script would need to account for that, and possibly also work in an environment where different LO versions exist on different systems.

A new shiny way to do this

Starting with Collabora Office 5.3-48, and coming with LibreOffice 6.1 in August, it is also possible to set this up using a GPO Administrative Template that we at Collabora Productivity created for our partner Studio Storti. It allows system administrators running Active Directory domains to perform the configuration conveniently using familiar tools and workflow. Please see the documentation file describing this in detail, in the folder with the ADMX file.

Big consequences of a good bug report

Last week I fixed a trivial bug: a leftover from an earlier change in which what a function returns was changed, but one place where it was used escaped conversion and did not handle the changed return properly. The fix seems to have simultaneously fixed a number of other bugs (the discussion may be found in the bug tracker issue). The little bug (a few characters) turned out to cause both performance issues and clipping of characters, so it had a big impact on LibreOffice on Windows (with DirectWrite, e.g. when OpenGL is used).

The problem became trivial both to find and to fix because of the great bug report by Telesto, who not only filed the report, but also provided every relevant piece of information, including the terminal output that accompanied the problem. I cannot emphasize enough the importance of this: the effort of the bug reporter makes a difference. Without that effort, some problems remain very difficult for developers to track down and fix.

I write this to praise Telesto’s great job, and to urge every reporter of a bug to follow this great lead.

Windows Unicode API usage in LibreOffice

Windows still provides two sets of many of its Win32 API functions taking or returning strings: a legacy “Ansi” set (functions named like fooA) and a Unicode set (named like fooW; available since Windows NT, and in Windows 95 with Layer for Unicode – and thus on any Windows OS supported by LibreOffice).

The “Ansi” functions take 8-bit strings in the current codepage (single- or multibyte). The repertoire of characters representable in those strings is, naturally, limited to that codepage (which is either set up in the system’s Language for non-Unicode programs setting, or explicitly set by the running application). Unfortunately, unlike other contemporary OSes, Windows doesn’t allow setting its locale to use UTF-8. If a string that contains characters outside of that set arrives at such a function, the string content will be altered, and the function’s behaviour might change unexpectedly.
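
The loss can be reproduced without any Windows API at all. In the sketch below, Python’s codecs stand in for the conversion to an 8-bit codepage; the actual Windows conversion may differ in detail, and the helper names and sample file names are made up for illustration.

```python
def roundtrip(name, codepage):
    """Encode to an 8-bit codepage and back; unrepresentable characters become '?'."""
    return name.encode(codepage, errors="replace").decode(codepage)

def survives(name, codepage):
    """True if the name is unchanged after the round trip through the codepage."""
    return roundtrip(name, codepage) == name
```

A Cyrillic file name such as отчёт.odt survives a round trip through cp1251, but not through the Western European cp1252, where every Cyrillic letter is replaced. A file operation using the altered name would then look for a different file.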

“W” versions of those functions take UCS-2 strings, which are able to represent most of the Unicode range (I am unsure if those strings are actually UTF-16, and so are able to represent the full Unicode repertoire, but anyway, even UCS-2 is much wider than most single- or multi-byte codepages).

In the last two weeks, we have replaced many places (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D) in the LibreOffice codebase where legacy “A”-functions were still used with explicit calls to their “W”-counterparts, removing redundant conversions of strings from LibreOffice’s internal UTF-16 string representation to 8-bit strings and back. One of the most significant effects might be on file-management functions, where such conversions could alter paths/names containing Unicode characters not representable in the currently selected 8-bit codepage, and lead to failed file operations. One example of such problems is tdf#103525.

The changes are included in master towards 6.0.
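
The difference between the two encodings is easy to see on a single character. The following sketch only illustrates the encodings themselves and makes no claim about what a particular Windows function accepts; the helper names are hypothetical.

```python
def utf16_code_units(text):
    """Return the 16-bit code units of the text in UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

def fits_in_ucs2(text):
    """True if every character is in the Basic Multilingual Plane (one 16-bit unit each)."""
    return all(ord(ch) <= 0xFFFF for ch in text)
```

A character such as Ж takes one 16-bit unit in both encodings, while 😀 (U+1F600) lies outside the Basic Multilingual Plane: UCS-2 cannot represent it, and UTF-16 encodes it as the surrogate pair 0xD83D, 0xDE00.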