A few days ago, Richard Kulisz published an article on meta-leveling. Since the state of current communication technology was on my mind at the time anyway, I decided to post a small analysis on this blog. Designing a concrete new model is left as an exercise for the reader.

Forums and Newsgroups

There are mailing lists, web forums and newsgroups[1]. The first uses email, the second a web browser and the third a newsreader client.

Mail is very low-level. Thus, mailing lists don't allow categorization/grouping and there is no archive per se. The last point has long been addressed, but the solution is an obvious hack. While the interface to different mailing lists is unified, this is not the case for mailing list archives. Even the archive of a given mailing list has to be accessed through a completely different interface than the list itself[2]. The fact that categorization is impossible leads to the often observed solution of creating several lists, each addressing a specific category (bugs, users, developers, etc.).

Forums, on the other hand, allow categorization and present the whole archive using the same interface. The problem: that interface completely sucks[3]. Every forum has a different one, and it's impossible to access multiple forums using the same interface.

Yes, this is something newsgroups got right[4]. But categorization is hierarchical in both cases, something that should nowadays at least be questioned thoroughly.

Abstracting from some concrete properties, we can conclude that the only difference is that mailing lists are "push" while the other two technologies are "pull". Strictly pushing updates implies that the whole data set has to be transferred -- you can't pull by definition. On the other hand, pulling updates allows individual selection of elements from the data set.
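
The push/pull distinction can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `PushList` and `PullArchive` and their methods are made up for this post and correspond to no real mail or news software.

```python
class PushList:
    """Mailing-list style: every post is delivered to every subscriber."""

    def __init__(self):
        self.inboxes = {}  # subscriber name -> list of delivered posts

    def subscribe(self, name):
        self.inboxes[name] = []

    def post(self, message):
        # The sender decides: everybody receives the whole data set.
        for inbox in self.inboxes.values():
            inbox.append(message)


class PullArchive:
    """Forum/newsgroup style: posts stay on the server until requested."""

    def __init__(self):
        self.posts = []

    def post(self, message):
        self.posts.append(message)

    def fetch(self, wanted):
        # The reader decides: only the selected posts are transferred.
        return [p for p in self.posts if wanted(p)]


messages = ["bug: crash on start", "user: how do I install?", "bug: typo"]

mailing_list = PushList()
mailing_list.subscribe("alice")
archive = PullArchive()
for m in messages:
    mailing_list.post(m)
    archive.post(m)

pushed = mailing_list.inboxes["alice"]                # everything
pulled = archive.fetch(lambda p: p.startswith("bug:"))  # a selection
print(len(pushed), len(pulled))
```

The subscriber of the push list ends up with all three messages, while the reader of the pull archive transfers only the two it asked for.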

Newsfeeds

As you might have noticed, I'm currently developing a news aggregator. So I've been working with the technology of newsfeeds (RSS specifically). It's embarrassing that it took me so long, but after about one week my subconscious brought this issue to the meta-level.

A newsfeed provides an asynchronous simplex communication channel from N authors to M readers. Since newsfeed technology by definition also uses a central access point, there are practically two separate communication channels, one N:1 and the other 1:M.

Now let's pretend this communication was synchronous. So when a news item gets posted, the readers directly receive it. Now replace "news item" with "email" and "posted" with "sent", and you get the point. By the same argument, the set of readers can also be viewed as subscribers to a specially moderated mailing list.

So the only difference between email and newsfeeds seems to be the method of communication (push vs. pull) and the fact that their narrow, non-extensible sets of attributes overlap only partially.
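
To make the argument concrete, here is a minimal sketch of a feed as two channels around a central access point. The class `Feed` and its methods are hypothetical and not part of RSS or of any real library; the list of allowed authors stands in for the "special moderation".

```python
class Feed:
    """Central access point: N authors publish to it, M readers read from it."""

    def __init__(self, authors):
        self.authors = set(authors)  # who may publish (the "moderation")
        self.items = []              # stored items, for readers who pull
        self.subscribers = []        # delivery callbacks, for readers who get pushed to

    def publish(self, author, item):
        # N:1 channel: authors -> central access point.
        if author not in self.authors:
            raise PermissionError(author)
        self.items.append(item)
        # 1:M channel, synchronous variant: deliver immediately ("email").
        for deliver in self.subscribers:
            deliver(item)

    def poll(self, seen):
        # 1:M channel, asynchronous variant: the reader asks for what is new.
        return self.items[seen:]


feed = Feed(authors=["alice", "bob"])

inbox = []                              # a reader in "mailing list" mode
feed.subscribers.append(inbox.append)

feed.publish("alice", "post 1")
feed.publish("bob", "post 2")

polled = feed.poll(0)                   # a reader in "newsfeed" mode
print(inbox == polled)
```

Both readers end up with the same items; the only thing that differs is who initiates the transfer.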

Instant Messaging

Some people might think that instant and non-instant messaging are conceptually similar. If this were the case, we could just conclude that the concepts of email and instant messaging are the same on the meta-level (like in the sections above).

This view seems popular, and recently even Facebook redesigned its communication system along those lines. However, many users complained.

After some thought it should be clear that instant messages and email are different concepts. Instant messaging provides direct/interactive, informal chatting. Mail, however, is more formal and indirect/non-interactive[5]. There are different use cases for picking one over the other.

Facebook had it right before. Chats and messages were clearly divided, but tightly integrated. This was, in fact, even more conceptually sound than most of the "real" instant messaging systems that include non-instant messaging as well.

Conclusion

Is it necessary to have very different technologies just because we sometimes want communication to be synchronous, and sometimes asynchronous? I don't think so. We could save half the effort by using synchronous protocols in combination with a proxy[6][7].
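
A rough sketch of that idea, assuming a single synchronous `deliver` operation and a proxy that holds messages while the recipient is away. The names `Recipient` and `Proxy` are invented for illustration and do not describe any existing protocol.

```python
from collections import deque


class Recipient:
    """An endpoint that accepts synchronous deliveries."""

    def __init__(self):
        self.received = []

    def deliver(self, message):
        self.received.append(message)


class Proxy:
    """Accepts deliveries on behalf of a recipient who may be offline."""

    def __init__(self, recipient):
        self.recipient = recipient
        self.online = False
        self.queue = deque()

    def deliver(self, message):
        # Senders always use the same synchronous call.
        if self.online:
            self.recipient.deliver(message)  # pass straight through
        else:
            self.queue.append(message)       # hold until the recipient fetches

    def connect(self):
        # The recipient comes online and the backlog is flushed in order.
        self.online = True
        while self.queue:
            self.recipient.deliver(self.queue.popleft())

    def disconnect(self):
        self.online = False


reader = Recipient()
proxy = Proxy(reader)

proxy.deliver("first")                 # reader offline: queued
after_offline = list(reader.received)
proxy.connect()                        # reader fetches: queue is flushed
proxy.deliver("second")                # reader online: delivered immediately
print(after_offline, reader.received)
```

The sender never needs a second, asynchronous protocol; whether a message arrives instantly or later depends only on the state of the proxy.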

It should be clear that I'm not arguing for replacing all this technology by email. Okay, maybe I am. As long as we're using rudimentary operating systems, maybe we should use rudimentary technology, too, iff this allows us to leverage a larger part of the OS.

But this is backwards!

In Richard's view meta-leveling is a skill that is lacking in most programmers. I'm not sure about this. What I'm sure about is that meta-leveling is not used nearly enough. If it were used more thoroughly, computing would be much more elegant and powerful.


  1. Nowadays, newsgroups are used less often. But they are useful, especially when compared with web forums.
  2. This hack is necessary. You can imagine what would happen if users were sent the whole archive upon subscription.
  3. HTML interfaces so poorly with the environment, the web browser and the user that it honestly shouldn't even qualify as an interface.
  4. In general this is, in fact, false if we strictly compare their feature set against that of an average web forum.
  5. This can be seen especially well when comparing their means of file transmission. Instant messaging streams the data directly; email transfers it (usually indirectly) as "a single piece".
  6. Compare with email, which is synchronous, yet almost every user fetches it asynchronously from their mail server, for example by using the program fetchmail.
  7. With a sufficiently central proxy, probably on the same machine that usually sends the data, fetching it will provide the same anonymity as current services (e.g., newsfeeds).
Posted 2011-05-03 21:58 Tags: networking

Installation of Broken Systems

Why would somebody want to install broken systems? Well, when there is no alternative to broken systems, you'll have to choose one of them.

Gentoo GNU/Linux is my broken system of choice. It gives me the most freedom and allows me to use the actual operating system, i.e. I can use software that just sucks instead of software that sucks more, and I get exposed to all of the pitfalls and unconceptualized semi-solutions of unixoid systems. Valuable lessons.

And, of course, I did not have to wait long for the next flaw.

The basic GNU system was installed and I had entered a chroot. Next was the installation of various software using Portage, Gentoo's package manager. Portage, or rather wget, refused to download sources. The address resolution problems it reported were a bad excuse, since downloading exactly the same URLs by calling wget myself in the very same instance of bash worked.

All right, I wanted to try Paludis, the other package mangler, anyway, since its developers claim to have concepts and well-factored code, as opposed to Portage, which has neither[1]. And, lucky me, this one could fetch source packages. Installation could continue.

By the way, the machine to be equipped with Gentoo is a netbook featuring an AMD Geode LX800. Of course it would be a dumb idea to compile a whole OS on a single machine equipped with a 500MHz CPU. But that's what distribution is for -- and that's where the fun begins.

Unix in itself has no support for distribution whatsoever. Distcc has been written to "address" this issue by providing a hackish solution for compilation on multiple machines. Once installed, the first workaround has to be applied[2].

Configuration is pretty straightforward. There is /etc/env.d/02distcc, /etc/distcc/hosts and the program distcc-config. The latter is a workaround to hide the ugliness of multiple configuration mechanisms. (FYI, the list of configuration files I mentioned does not include the separate configuration file for the included server.)

To actually use distcc, $PATH has to be modified additionally. Now, compilation will be distributed.

At least in theory.

Distcc couldn't distribute the compilation because "failed to distribute". How smart. Okay, let's enable verbosity. Output now contains various information like time in milliseconds the compilation took, and I'm told that distcc could not distribute because "failed to distribute". Fuck you, I'm pretty aware of that.

After some time fighting configuration files and distcc-config, I noticed that both "distcc-config --get-hosts" and "distcc --show-hosts" return "+zeroconf", regardless of the configured settings. Yeah, distributing compilation to a host named +zeroconf won't work. And what's the sanest message to report this problem when the maximum of verbosity is requested? "Failed to distribute"? Bullshit.


References

  1. See Paludis FAQ on http://paludis.pioto.org.
  2. See Gentoo Distcc Manual.
Posted 2010-02-14 17:02 Tags: networking

File Transfer Pain

EverythingIsAFile, right? Wrong.

Unix is undistributed. You can't, for example, just work with files regardless of where they are located.

One good example is FTP. Traditionally, working with files via FTP on a user interface level is done by using an FTP client. This is probably the most brain-dead way such a task could be accomplished:

  • The user's shell loses part of its control: input is handled by the shell that the FTP client provides.
  • No access to external programs or at least not in the usual way.
  • No output redirection or similar shell features.
  • Files cannot be created in the usual manner, if at all.

In short, there is no consistency at all in administering files locally vs. via FTP. Simple things are inconvenient, complex things nearly impossible. The client becomes the limiting factor for all the features your operating system or your shell provides.

A workaround is setting up an FTP filesystem, which allows you to mount a directory of an FTP server into your local filesystem hierarchy, thus allowing you to work with the remote files almost as usual.

This is, AFAIK, the default on Plan 9. On GNU/Linux it can be done by using curlftpfs:

curlftpfs -f ftp.example.org ~/ftp.example.org

Once the server is mounted, you can use rsync to upload new files:

rsync -Pprvu --delete ~/mirror/ftp.example.org ~/ftp.example.org
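
The same point holds for any program, not just rsync: once the server is a directory, ordinary file APIs are all that is needed. As a rough sketch, here is a small Python function that copies files which are missing or older at the target. The function name `copy_newer` is made up, the demonstration uses two temporary directories as stand-ins for the local mirror and the mount point, and how well timestamps behave on an actual curlftpfs mount is an assumption I have not verified.

```python
import shutil
import tempfile
from pathlib import Path


def copy_newer(source: Path, target: Path) -> list:
    """Copy files from source to target if they are missing or older there."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = target / rel
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue  # the target copy is already up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)  # plain file copy, nothing FTP-specific
        copied.append(rel.as_posix())
    return copied


# Two temporary directories stand in for the local mirror
# and for the directory where the FTP server is mounted.
with tempfile.TemporaryDirectory() as tmp:
    mirror = Path(tmp, "mirror")
    mount = Path(tmp, "mount")
    (mirror / "docs").mkdir(parents=True)
    mount.mkdir()
    (mirror / "index.html").write_text("hello")
    (mirror / "docs" / "a.txt").write_text("a")

    first = copy_newer(mirror, mount)   # copies both files
    second = copy_newer(mirror, mount)  # nothing left to do

print(first, second)
```

Nothing in the function knows about FTP; whether the target is a local disk or a mounted server is decided entirely by the filesystem.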
Posted 2009-10-09 15:00 Tags: networking
