As our way of thanking you for your positive contributions to Slashdot, you are eligible to disable advertising.

10 years of commenting on Slashdot finally paid off!

Sadly, the site is not what it used to be. It seems to be mostly reposting stuff I have already read elsewhere. They used to have the news while it was still hot; part of the attraction always was their rapid response to news events. That seems to have completely gone. I read about Steve Jobs’ liver transplant a full day ahead of the Slashdot post.

I used to be a pretty regular commenter there. Now I tend to comment less and less.

OpenID, the identity landscape, and social networks

I’m still getting used to no longer being at Nokia Research Center. One of my disappointments of being in NRC and being a vocal proponent of OpenID, social networks, etc. was that, despite lots of discussion, not much happened in terms of me getting room to work on these topics or convincing many people of my opinions on them. I have one publication that is due out whenever the magazine involved gets around to approving and printing the article. But that’s it.

So, I take great pleasure in observing how things are evolving lately and finding that I’ve been pushing the right topics all along. Earlier this week, Facebook became a relying party for OpenID. Outside the OpenID community and regular TechCrunch readers, this seems to have not been a major news story. Yet just about everybody I discussed this topic with in the past few years (you know who you are) always insisted that “no way that a major network like Facebook will ever use OpenID”. If you were one of those people: admit right now that you were wrong.

It seems to me that this is a result of the fact that the social networking landscape is maturing. As part of this maturation process, several open standards are emerging. Identity and authentication are very important topics here and it seems the consensus is increasingly that no single company is going to own all 6-7 billion identities on this planet. So naturally any company with the ambition to potentially separate 6-7 billion individuals from their money for some product or service will need to work with multiple identity providers.

So naturally such companies require a standard for doing so. That standard is OpenID. It has no competition. There is no alternative. There are plenty of proprietary APIs that only work with limited sets of identity providers but none like OpenID that can work with all of them.

Similarly, as major identity providers like Google and Facebook are stuck sharing a few hundred million users between them, they are shifting their attention to somehow involving all those users that didn’t sign up with them. Pretty much all of them are OpenID providers already. Facebook just took the obvious next step in becoming a relying party as well. The economics are mindbogglingly simple: Facebook doesn’t make money from verifying people’s identities but they do make money from people using their services. Being an OpenID relying party means the group of people who can access their services just grew to the entire internet population. Why wouldn’t they want that? Of course this doesn’t mean that world + dog will now be a Facebook user but it does mean that one important obstacle has just disappeared.

BTW, Facebook’s current implementation is not very intuitive. I’ve been able to hook up myOpenID to my Facebook account but I haven’t actually found a login page where I can log in with my OpenID yet. It seems that this is still a work in progress.

Anyway, this concludes my morning blogging session. Haven’t blogged this much in months. Strange how the prospect of not having to work today is energizing me :-)

wolframalpha

A few years ago, a good friend gave me a nice little present: 5 kilos of dead tree in the form of Stephen Wolfram’s “A new kind of science”. I never read it cover to cover and merely scanned a few pages with lots of pretty pictures before deciding that this wasn’t really my cup of tea. I also read some of the criticism of this book from the scientific community. I’m way out of my league there so, no comments from me except a few observations:

  • Presentation of the book is rather pompous and arrogant. The author tries to convince the readers that they have the most important piece of science ever produced in their hands.
  • This is what set off most of the criticism. Apparently, the author fails both to credit related work and to back up some of his crucial claims with proper evidence.
  • Apparently there are quite a few insufficiently substantiated claims, which affect the credibility of the overall book and of the author’s claims.
  • The author took the ivory tower approach to writing the book: he quite literally dedicated a decade+ of his life to writing it, during which he did not seek out much criticism from his peers.
  • So, the book is controversial and may either turn out to be the new relativity theory (relatively speaking) or a genuine dud. I’m out of my league deciding either way.

Anyway, the same Stephen Wolfram has for years been providing the #1 mathematical software IDE: Mathematica, which is one of the most popular software tools for anyone involved with mathematics. I’m not a mathematician and haven’t touched such tools in over 10 years now (dabbled a bit with linear algebra in college) but as far as I know, his company and product have a pretty solid reputation.

Now the same person has brought the approach he applied to his book and his solid reputation as the owner of Mathematica to the wonderful world of Web 2.0. Now that is something I know a thing or two about. Given the above I was initially quite sceptical when the first, pretty wild, rumors around wolframalpha started circulating. However, some hands-on experience has just changed my mind. So here’s my verdict:

This stuff is great & revolutionary!

No it’s not Google. It’s not Wikipedia either. It’s not Semantic web either. Instead it’s a knowledge reasoning engine hooked up to some authoritative data sets. So, it’s not crawling the web. It’s not user editable and it is not relying on traditional Semantic web standards from e.g. W3C (though very likely it must be using similar technology).

This is the breakthrough that was needed. The semantic web community seems to be stuck in an endless loop pondering pointless standards, query formats, graph representations and generally rehashing computer science topics that have been studied for 40 years now without producing many viable business models or products. Wikipedia is nice but very chaotic and unstructured as well. The marriage of semantic web and wikipedia is obvious, has been tried countless times, and has so far not produced interesting results. Google is very good at searching through the chaos that is the current web but can be absolutely unhelpful with simple, fact based questions. Most fact based questions in Google return a wikipedia article as one of the links. Useful, but it doesn’t directly answer the question.

This is exactly the gap that wolframalpha fills. There are many scientists and startups with the same ambition but Wolframalpha.com got to market first with a usable product that can answer a broad range of factual questions with knowledge imported into its system from trustworthy sources. It works beautifully for the facts and knowledge it has and allows users to do two things:

  • Find answers to pretty detailed queries from trustworthy sources. Neither Wikipedia nor Google can do this; at best they can point you at a source that has the answer and leave it up to you to judge the trustworthiness of the source.
  • Fact surfing! Just like surfing from one topic to the next on Wikipedia is a fun activity, I predict that drilling down into facts on wolframalpha is equally fun and useful.

So what’s next? Obviously, wolframalpha.com will have competition. However, their core asset seems to be their reasoning engine combined with the quite huge fact database which is to date unrivaled. Improvements in both areas will solidify their position as market leader. I predict that several owners of large bodies of authoritative information will be itching to be a part of this and partnership deals will be announced. Wolframalpha could easily evolve into a crucial tool for knowledge workers. So crucial even that they might want to pay for access to certain information.

Some more predictions:

  • Several other startups will start competing soon with similar products. There should be dozens of companies working on similar or related products. Maybe all they needed was somebody taking a first step.
  • Google likely has people working on such technologies; they will either launch or buy products in this space in the next two years.
  • Main competitors of Google are Yahoo and MS who have both been investing heavily in search technology and experience. They too will want a piece of this market
  • With so much money floating around in this market, wolframalpha and similar companies should have no shortage of venture capital, despite the current crisis. Also, wolframalpha might end up being bought up by Google or MS.
  • If not bought up or outcompeted (both of which I consider to be likely), wolframalpha will be the next Google

Java Profiling

One of the fun aspects of being in a programmer job is the constant stream of little technical problems that require digging into. This can sometimes be frustrating but it’s pretty cool if you suddenly get it and make the problem go away. Anyway, since starting in my new job in February, I’ve had lots of fun like this. Last week we had a bit of Java that was obviously out of line performance wise. My initial go at the problem was to focus on the part that had been annoying me to begin with: the way XML parsing was handled. There are many ways to do XML parsing in Java. We use JAXB. JAXB is nice if you don’t have enough time to do the job properly with XPath but the trade off is that it can be slow and that there are a few gotchas. For example, creating marshallers and unmarshallers is way more expensive than actually using them. So when processing a shitload of XML files, you spend a lot of time creating and destroying marshallers, especially if you break down the big XML files into little blobs that are parsed individually. Some simple pooling using ThreadLocal improved things quite a bit, but one particular class still felt unreasonably slow in a way that I could not explain with just XML parsing.
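For the curious, the ThreadLocal pooling idea looks roughly like the sketch below. This is not our production code: the class names are made up for illustration, and a StringBuilder stands in for the expensive object so the example runs without JAXB on the classpath.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Keeps one expensive-to-create helper per thread instead of building one per document. */
class PerThreadCache<T> {
    private final AtomicInteger created = new AtomicInteger();
    private final ThreadLocal<T> cache;

    PerThreadCache(Supplier<T> factory) {
        // withInitial runs the factory lazily, once for each thread that calls get().
        this.cache = ThreadLocal.withInitial(() -> {
            created.incrementAndGet();
            return factory.get();
        });
    }

    /** Returns the instance owned by the calling thread. */
    T get() {
        return cache.get();
    }

    /** Number of instances built so far, across all threads. */
    int createdCount() {
        return created.get();
    }
}

class PerThreadCacheDemo {
    public static void main(String[] args) {
        // StringBuilder is only a stand-in for a costly object such as an unmarshaller.
        PerThreadCache<StringBuilder> cache = new PerThreadCache<>(StringBuilder::new);
        for (int i = 0; i < 1000; i++) {
            cache.get().setLength(0); // the same instance is reused for every "document"
        }
        System.out.println("instances created: " + cache.createdCount());
    }
}
```

With JAXB, the factory would presumably wrap a call to createUnmarshaller() on a shared JAXBContext and deal with the checked JAXBException. Per-thread rather than global sharing matters because unmarshallers are not thread-safe.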

So I spent two days setting up a profiler to measure what was going on. Two days? Shouldn’t this be easy? Yes, except there are a few gotchas.

  1. The Eclipse TPTP project has a nice profiler. Except it doesn’t work with macs, or worse, macs with jdk1.6. That’s really an eclipse problem: the UI is tied to 1.5 because Apple stopped supporting Cocoa integration in 1.6.
  2. So I fired up vmware, installed the latest Ubuntu 9.04 (nice), spent several hours making that behave nicely (file sharing is broken and needs a patch). Sadly no OpenGL eyecandy in vmware.
  3. Then I installed Java, eclipse, TPTP, and some other stuff
  4. Only to find out that TPTP with JDK 1.6 is basically unusable. First, it comes with some native library compiled against a library that is no longer used. Solution: install it.
  5. Then at every turn you take there’s some error about agent controllers. If you search for this you will find plenty of advice telling you to use the right controller but none whatsoever as to how you would go about doing so. Alternatively people tell you to just not use jdk 1.6. I know because I spent several hours on this before joining the gang of “TPTP just doesn’t work, use netbeans for profiling”.
  6. So, still in ubuntu, I installed Netbeans 6.5, imported my eclipse projects (generated using maven eclipse:eclipse) and to my surprise this actually worked fine (no errors, tests seem to run).
  7. Great, so I right clicked a test and chose “profile file”. Success! After some fiddling with the UI (quite nerdy and full of usability issues) I managed to get exactly what I wanted.
  8. Great! So I exit vmware to install Netbeans properly on my mac. Figuring out how to run with JDK 1.6 turned out to be easy.
  9. Since I had used vmware file sharing, all the project files were still there so importing was easy.
  10. I fired up the profiler and it had remembered the settings I last used in linux. Cool.
  11. Then netbeans crashed. Poof! Window gone.
  12. That took some more fiddling to fix. The release notes indeed mention two cases of crashes during profiling, which you can fix with some commandline options.
  13. After doing that, I managed to finally get down to analyzing what the hell was going on. It turned out that my little test was somehow triggering 4.5 million calls to String.replaceAll. WTF!
  14. The nice thing with inheriting code that has been around for some time is that you tend to ignore those parts that look ugly and don’t seem to be in need of your immediate attention. This was one of those parts.
  15. Using replaceAll is a huge code smell. Using it in a triple nested for loop is insane.
  16. So some more pooling, this time of the regular expression objects. Pattern.compile is expensive.
  17. I re-ran the profiler and … problem gone. XML parsing now is the bottleneck as it should be in code like this.
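The fix from steps 15 and 16 boils down to something like the sketch below. The whitespace regex and the class name are invented for illustration; the real code used different expressions. The point is that String.replaceAll compiles its regex on every call, whereas a Pattern constant is compiled once.

```java
import java.util.regex.Pattern;

class WhitespaceNormalizer {
    // Compiled once; Pattern instances are immutable and safe to share between threads.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Same result as input.replaceAll("\\s+", " "), minus the per-call Pattern.compile. */
    static String normalize(String input) {
        return WHITESPACE.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("a  b\t\tc"));
    }
}
```

Inside a nested loop that runs millions of times, hoisting the compile step out like this is where the savings come from; the matching itself still has to happen.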

But, shouldn’t this just be easy? It took me two days of running from one problem to the next just to get a profiler running. I had to deal with crashing virtual machines, missing libraries, cryptic error messages about Agent Controllers, and several unrelated issues. I hope somebody in the TPTP project reads this: your stuff is unusable. If there’s a magic combination of settings that makes this shit work as it should: I missed it, your documentation was useless, and the most useful suggestion I found was to not use TPTP. No, I don’t want to fiddle with cryptic vm commandline parameters, manually compile C shit, fiddle with well hidden settings pages, etc. All I wanted was right click, profile.

So am I now a Netbeans user? No way! I can’t stand how tedious it is for coding. Run profiler in Netbeans, go ah, alt tab to eclipse and fix it. Works for me.

Localization rant

I’ve been living outside the Netherlands for a while and have noticed that quite a few web sites are handling localization and internationalization pretty damn poorly. In general I hate the poor translations unleashed on Dutch users and generally prefer the US English version of UIs whenever available.

I just visited Youtube. I’ve had an account there for over two years. I’ve always had it set to English. So, surprise, surprise, it asked me for the second time in a few weeks, in German, whether I would like to keep my now fully Germanified Youtube set to German. Eehhhhh?!?!?! nein (no). Abrechen (cancel)! At least they ask, even though in the wrong language. Most websites don’t even bother with this.

But stop and think about this. You’ve detected that somebody who has always had his profile set to English is apparently in Germany. Shit happens, so now what? Do you think it is a bright idea to ask this person in German whether he/she no longer wants the website presented in whatever it was set to earlier? Eh, no of course not. Chances are good people won’t even understand the question. Luckily I speak enough German to know Abrechen is the right choice for me. When I was living in Finland, convincing websites I don’t speak Finnish was way more challenging. I recall fighting with Blogger (another Google owned site) on several occasions. It defaulted to Finnish despite the fact that I was signed in to Google and had every possible setting Google provides for this set to English. Additionally, the link for switching to English was three clicks away from the main page. Impossible to do unless you know the Finnish words for preferences, language, and OK (in which case you might pass for a native speaker). I guess I’m lucky to not live in e.g. China where I would stand no chance whatsoever to guess the meaning of buttons and links.

The point here is that most websites seem to be drawing the wrong conclusions based on a few stupid IP checks. My German colleagues are constantly complaining about Google defaulting to Dutch (i.e. my native language, which is quite different from Deutsch). Reason: the nearest Nokia proxy is in Amsterdam so Google assumes we all speak Dutch.

So, cool you can guesstimate where I am (roughly) in the world but don’t jump to conclusions. People travel and move around all the time. Mostly they don’t change their preferred language until after a lot of hard work. I mean, how hard can it be? I’m already signed in, right? Cookies set and everything. In short, you know who I am (or you bloody well should given the information I’ve been sharing with you for several years). Somewhere in my profile, it says that my preferred language is English, right? I’ve had that profile for over four years, right? So why the hell would I suddenly want to switch language to something that I might not even speak? A: I wouldn’t. No fucking way that this is even likely to occur.

It’s of course unfair to single out Google here. Other examples are iTunes, which has a full English UI in Finland but made me accept the terms of use in Finnish (my knowledge of Finnish is extremely limited, to put it mildly). Finland is of course bilingual and 10 percent of its population are Swedish speaking Finns, most of whom probably don’t handle Finnish that well. Additionally there are tens of thousands of immigrants, tourists and travelers, like me. Now that I live in Germany, I’m stuck with the Finnish iTunes version, because I happened to sign up while I was in Finland. Switching to the German store is impossible, i.e. I can’t access the German TV shows for sale on iTunes Germany. Never mind the US English ones I’m actually interested in accessing and spending real $$$/€€€ on. Similarly, I’ve had encounters with Facebook asking me to help localize Facebook to Finnish (eh, definitely talking to the wrong guy here) and recently to German (still wrong).

So, this is madness. A series of broken assumptions leads to Apple losing revenue and Google and others annoying the hell out of people.

So here’s a localization guideline for dummies:

  • Offer a way out. Likely a large percentage of your guesses as to what the language of your users is will be wrong. The smaller the number of native speakers, the more likely you will get it wrong. Languages like Finnish or Chinese are notoriously hard to learn. So, design your localized sites such that a non native speaker of such languages can get your fully localized sites set to something more reasonable.
  • Respect people’s preferences. Profiles override anything you might detect. People move around so your assumptions are likely broken if they deviate from the profile settings.
  • Language is not location. People travel around and generally don’t unlearn the language they used to speak. Additionally, most countries have sizable populations of non native speakers as well as hordes of tourists and travelers.
  • If people managed to sign up, that’s a strong clue that whatever the language of the UI was at the time is probably a language that the user has mastered well enough to understand the UI (or otherwise you’d have blind monkeys signing up all the time). So there’s no valid use case for suggesting an alternative language here. Never mind defaulting to one.
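In code, the guideline above amounts to a simple order of precedence. Here is a minimal sketch; the class, method, and parameter names are made up, and null simply means "we don't know". The IP based guess only gets a say when there is nothing better to go on.

```java
class LanguageResolver {
    /**
     * Picks the UI language. An explicit profile setting always wins, then the
     * language the user signed up in; the IP-based guess is only a fallback
     * for visitors we know nothing about.
     */
    static String resolve(String profileLanguage, String signupLanguage,
                          String geoIpGuess, String siteDefault) {
        if (profileLanguage != null) {
            return profileLanguage;
        }
        if (signupLanguage != null) {
            return signupLanguage;
        }
        if (geoIpGuess != null) {
            return geoIpGuess;
        }
        return siteDefault;
    }

    public static void main(String[] args) {
        // Signed-in user with an English profile, currently browsing from Germany.
        System.out.println(resolve("en", "en", "de", "en"));
        // Anonymous first-time visitor from Germany: the guess is all we have.
        System.out.println(resolve(null, null, "de", "en"));
    }
}
```

Even when the guess is used, the first bullet still applies: show a language switch that somebody who doesn’t read the guessed language can find.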

Anyway, end of rant.

kamppi.nokia.mobi

I’m rather late with this since it has been more than a month since this was news. Busy, moving to Berlin, and other lame excuses. But better late than never.

You might recall a little youtube video I posted back in October of me demoing a prototype at a press event from Nokia. As promised, but slightly later than planned, my colleagues back in Helsinki actually launched (press release) the thing in a place in Helsinki called Kamppi. Kamppi is a shopping mall plus bus station in the center of Helsinki. About 100,000 people pass through the building every day. Mostly commuters but also shoppers. There are several floors with shops, restaurants, etc. It’s an ideal setting for trialing our system and my colleagues worked hard to get the shops in Kamppi on board.

By launching I mean that we opened kamppi.nokia.mobi, which is a mobile website, for the public. You can visit this with your desktop browser of course but the website is designed to be used from a mobile phone with a mobile browser. A broad range of browsers from different phone vendors is supported but for best results you need of course the latest and greatest from Nokia. You can actually use the website from anywhere in the world though admittedly it is a bit pointless to do so unless you are planning to visit Kamppi or are actually in Kamppi (or on your way to Kamppi).

The site we launched actually has fewer features than the stuff we demoed in October. The reason for this was a change in focus of the trial and not the technology. The trial is now focused on indoor maps, vouchers, shop pages, and ratings. We found that the other features we demoed in October were nice but also a bit confusing to users. They may be added back later on, but since I no longer work in Helsinki, it isn’t up to me. I did put a lot of work into getting this trial going. I helped build and design the software and many of the features and had lots of fun learning Python, Django, and a load of other stuff as well as re-acquainting myself with Apache Lucene, Java, and OSGi.

Some highlights of features I actually worked on/designed:

  • Implicit profiles. Save vouchers in your account without actually signing in. This is a big benefit because most users will never bother to signup. This was implemented after we implemented both OpenID and Nokia Account. In the end the usability argument won. Nobody asks for your ID when you grab a paper voucher at the entrance. Why should digital vouchers be any different?
  • Search. The search server behind the scenes is based on Apache Lucene and has custom extensions for indoor location tags, which we use to search through shops, vouchers, ads, etc. Many of the dynamic views have one or more search queries integrated to pull useful info out of the system. And of course there’s a search box as well. The website is in Finnish, which I don’t speak, and I configured Lucene to do Finnish stemming and tokenizing.
  • I didn’t build it but one of my former master thesis students, Daniel Wilms, did build the voucher subsystem for us after I sketched the design on a whiteboard. The voucher subsystem was born quite late in the process (a few months before the October demo) just because we could. We had been looking at voucher systems for some time because the use case was interesting and we decided to just go ahead and build it. We had the first working prototype implemented in about two days. The rest of the development was just tweaking the usability.
  • Additionally, Jaakko Kyro and I led an ever-changing team of developers that built this, and other stuff, over nearly two years. During this time there were many internal demos, including one to our CEO, and external demos, e.g. at the Internet of Things Conference in 2008.

So, it’s really nice that this is finally out. Go and check it out!

Photos Nov 2008 - Now

I finally found some time to upload some photos.

I went to Berlin in November to apply for the job I now have. Then I spent Christmas in France. After that, I moved into a temporary flat in Berlin on February 1st. On my first visit back in Finland, I visited a friend in Espoo who lives close to the sea, which was frozen. Finally I took some nice photos of Berlin in my first few weeks here.

The nicest of which is the view I had from my temporary flat:

View from temporary apartment

I no longer live there and photos of my new place are coming soon. I probably will take a few at my upcoming house warming party: Friday 17th from 21:00, feel free to drop by if you are near.

New theme

After saying good bye to Barthelme I actually considered sticking with the default wordpress theme for a while. Except, it’s ugly and not very nice to use either. So, after sorting by popularity, this one on the wordpress theme directory seemed nice enough. It’s called the Fusion theme and seems pleasant enough and at first glance comes with a couple of nice features. So sticking with this for a while.

I also took the opportunity of fixing the page structure (now that I have a jquery powered menu thingy).

Upgrade to wordpress 2.7.1

The joys of moving and starting a new job sadly also mean less time for things like upgrading wordpress. I’ve never been so far behind the main version (one major and a subsequent minor release). So, this morning I sat down to do the usually quick and efficient ritual: switch svn to the wordpress 2.7.1 tag, upload files, upgrade DB, and done.

Except it didn’t work. Damn. I spent nearly two hours running into several problems and trying to fix things. The root cause was threefold.

  1. I did not RTFM.
  2. The little upgrade instructions have subtly changed. Bla bla bla, and oh BTW you should add this stuff to your wp-config.php. Naturally that was a problem.
  3. The theme I have been using for quite some time, Barthelme, is no longer maintained and wp 2.7.1 doesn’t seem to like it.

Basically, it didn’t like the theme and hence could not show the site. Worse, it could not show the admin site either. All it showed was a blank page. Blank as in, 0 bytes. Blank as in, oops some weird php error and let’s just return 0 bytes. Not good. Somewhere in between I did actually manage to run the upgrade db script. That too didn’t have any UI but I could at least get to the upgrade link using view source in firefox. Hint: if you ever get this far, don’t do that. It didn’t make things better. I then tried removing plugin directories and theme directories to make it revert to defaults. Didn’t work, still a blank page. Then I did my usual diagnostics and found out about the wp-config settings, which by now made no difference. Still a blank page.

So, I screwed up and for the first time had to resort to actually using the db backup I thankfully made (I would have been in more trouble without that). Except that didn’t work either. Yikes. I took the nice little dump I had made earlier with phpMyAdmin and uploaded it. After some time it came back with a nice “Got a packet bigger than ‘max_allowed_packet’” error. This one took a while to figure out. Apparently this is a setting you can override if you have the right to do so, which I don’t. Most advice out there boils down to “hey, just fix my.cnf, restart the server and you’re good”. Yeah right, not much help but thanks. So then I figured that line length probably was the problem here. Indeed it was, since phpMyAdmin seems to default to generating one huge INSERT statement with a comma separated list of all the values. So everything I published on this blog since 2005 was on one line, comma separated. Sounds like a lot but it’s only 1.5MB or so.

Anyway, the solution is buried in a comment from Gareth in the mysql documentation: http://mysql.telepac.pt/doc/refman/5.1/en/packet-too-large.html

So just search and replace “),” by “); INSERT INTO `wp_posts` VALUES”, so that every row gets its own INSERT statement. Be careful to do that only on the insert you want to change, BTW.
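I did this by hand in a text editor, but scripted the same search and replace would look something like the sketch below. It is deliberately naive: the class and method names are invented, and it assumes that the character sequence “),(” only ever occurs between rows and never inside the text of a post. If that doesn’t hold for your data, it will corrupt it.

```java
class DumpSplitter {
    /**
     * Turns one multi-row INSERT into one INSERT per row.
     * Naive: assumes "),(" appears only between rows, never inside a quoted value.
     */
    static String splitInsert(String insertStatement, String table) {
        String rowBreak = ");\nINSERT INTO `" + table + "` VALUES (";
        return insertStatement.replace("),(", rowBreak);
    }

    public static void main(String[] args) {
        String dump = "INSERT INTO `wp_posts` VALUES (1,'first'),(2,'second');";
        System.out.println(splitInsert(dump, "wp_posts"));
    }
}
```

Each statement then stays well below the packet limit, at the cost of a somewhat slower import.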

Anyway, I reverted back to wordpress 2.6.5, imported my fixed dump, fixed my configuration to be prepared for 2.7.1 and checked if my blog survived. It did.

Then I switched the theme to the default theme (i.e. what you are looking at), did the svn switch, upload, upgrade db and this time everything worked.

So, you will have to excuse the odd layout problems for now. Default wordpress theme sucks and I suspect I have some overflow issues here and there. I will be fixing those shortly of course.

Time for a little update

Hmm, it’s been more than two months since I last posted. Time for an update. A lot has happened since January.

So,

  • I moved out of Finland as planned.
  • I stayed in a temporary apartment for a month. Central-home is the company managing the facility where I lived (on Habersaathstrasse 24) and if you’re looking for temporary housing in Berlin, look no further.
  • I managed to find a nice apartment for long term in Berlin Mitte, in the Bergstrasse, which is more or less walking distance from tourist attractions like Alexanderplatz, Hackeschermarkt, Friedrichstrasse and of course the Brandenburger Tor.
  • I re-acquainted myself with Java, Java development, and lately also release management. Fun days of hacking but the normal Nokia routine of meetings creeping into my calendar is sadly kicking in.
  • I learned tons of new stuff
  • Unfortunately German is not yet one of those things. My linguistic skills are as pathetic as ever and English remains the only foreign language I ever managed to master more or less properly. On paper German should be dead easy since I can get by mumbling in my native language and people can still figure out what I want. In practice, I can understand it if spoken slowly (and clearly). Speaking back is challenging.
  • I’m working on it though, once a week, in a beginners class. Relearning stuff that 3 years of trying to stuff German grammar in my head in high school did not accomplish.

Moving is tedious and tiresome. But the end result is some genuine improvement in life. I absolutely love Berlin and am looking forward to an early Spring. I was in a telco with some Finnish people today discussing the weather. They: so how’s Berlin? Any snow there still? Me: no, about 20 degrees outside right now :-). Nice to have spring start at the normal time again. Not to mention the more sane distribution of daylight and darkness throughout the year.

A shitload of updates is overdue. For several months already. I have a ton of photos to upload. Wordpress needs upgrading. And some technical stuff might need some blogging about as well. Then there are still some unfinished papers in the pipeline. So, I’ll be back with more. Some day.
