Generating bigrams, performance -- Scala vs Java

I am currently working at a start-up that is using Scala and Play Framework -- both have their pros and cons. I come from a more traditional programming background of Java-style languages, so it's been an adjustment. Recently, while analyzing some code, I ran into a strange performance discrepancy between Java and Scala.

We have code that generates bigrams as part of a Dice coefficient calculation across millions of records, and it had become a bit of a bottleneck. While trying to optimize it, I noticed that these two almost identical implementations performed very differently. I am going to post this question to StackOverflow and see if anyone has ideas as to why.
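For anyone unfamiliar with the technique, here is a minimal sketch of bigram generation and the Dice coefficient built on top of it. This is not the code that was benchmarked: the class name `BigramSketch`, the choice to split on whitespace and take bigrams per word, and the handling of empty input are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class BigramSketch {

    // Collects the distinct two-character substrings of each word.
    // Assumption: words are separated by whitespace and bigrams never span words.
    public static Set<String> toBigrams(String text) {
        Set<String> bigrams = new HashSet<>();
        for (String word : text.trim().split("\\s+")) {
            for (int i = 0; i + 1 < word.length(); i++) {
                bigrams.add(word.substring(i, i + 2));
            }
        }
        return bigrams;
    }

    // Dice coefficient: 2 * |A intersect B| / (|A| + |B|), in the range [0, 1].
    public static double dice(String a, String b) {
        Set<String> x = toBigrams(a);
        Set<String> y = toBigrams(b);
        if (x.isEmpty() && y.isEmpty()) {
            return 1.0; // assumption: two inputs without bigrams count as identical
        }
        Set<String> shared = new HashSet<>(x);
        shared.retainAll(y); // keep only bigrams present in both sets
        return 2.0 * shared.size() / (x.size() + y.size());
    }

    public static void main(String[] args) {
        // Six distinct bigrams: te, es, st, ab, bc, de (set order is unspecified)
        System.out.println(toBigrams("test test abc de"));
        // "night" and "nacht" share only "ht": 2 * 1 / (4 + 4) = 0.25
        System.out.println(dice("night", "nacht"));
    }
}
```

Because this version uses a `Set`, a repeated bigram such as the second "te" in "test test" is counted only once.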

Code

Results

I ran this a few times using different strings as input, but the Java version was consistently much faster. In the end, I picked a string that creates about 11 bigrams and ran it 1M times. The Java version took about 68% less time (roughly 623 ms versus 1985 ms, or about 3x faster)!

Scala version (about 1985 ms)

scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsScala("test test abc de")})
17:00:05.034 [info] Something took: 1985ms

scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsScala("test test abc de")})
17:00:08.417 [info] Something took: 1946ms

scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsScala("test test abc de")})
17:00:11.452 [info] Something took: 1970ms

Java version (about 623 ms)

scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsJava("test test abc de")})
17:01:51.597 [info] Something took: 623ms

scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsJava("test test abc de")})
17:01:53.094 [info] Something took: 620ms

scala> Util.time(for(i<-1 to 1000000) {Util.toBigramsJava("test test abc de")})
17:01:54.519 [info] Something took: 606ms
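The `Util.time` helper used in these runs is not shown. As a rough idea of what such a helper does, here is a sketch in Java. The names `TimingSketch`, `timeMillis`, and `timeAfterWarmUp` are hypothetical, and the warm-up loop is my own addition on the assumption that you would want the JIT compiler to have seen the code before the measured run.

```java
public class TimingSketch {

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Runs the task warmUpRuns times without measuring, then measures one more run.
    public static long timeAfterWarmUp(int warmUpRuns, Runnable task) {
        for (int i = 0; i < warmUpRuns; i++) {
            task.run();
        }
        return timeMillis(task);
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in the function being measured.
        Runnable work = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 1000000; i++) {
                    "test test abc de".split("\\s+");
                }
            }
        };
        System.out.println("Something took: " + timeAfterWarmUp(3, work) + "ms");
    }
}
```

An anonymous `Runnable` is used instead of a lambda so the sketch compiles on Java 7. For anything more serious than a quick comparison, a dedicated benchmarking harness would be a safer choice than hand-rolled timing.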

System

I ran this on Ubuntu 14.04, with 4 cores and 8 GB of RAM. Java version 1.7.0_45, Scala version 2.10.2.

Welcome to Scala version 2.10.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45).

Conclusion

In the end, we used code from Wikibooks that performed very well, almost 10x faster than the Scala code. It also doesn't use Sets, since we wanted to account for duplicate bigrams.
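To show what accounting for duplicates means in practice, here is a small sketch of a Dice coefficient that counts repeated bigrams instead of collapsing them into a Set. This is not the Wikibooks code; it is a simplified illustration that keeps a count per bigram in a map, and the names `DiceWithDuplicates` and `bigramCounts` are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class DiceWithDuplicates {

    // Counts every two-character substring per word, keeping repeats.
    // Assumption: words are separated by whitespace and bigrams never span words.
    public static Map<String, Integer> bigramCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            for (int i = 0; i + 1 < word.length(); i++) {
                String bigram = word.substring(i, i + 2);
                Integer seen = counts.get(bigram);
                counts.put(bigram, seen == null ? 1 : seen + 1);
            }
        }
        return counts;
    }

    // Total number of bigrams, repeats included.
    private static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int n : counts.values()) {
            sum += n;
        }
        return sum;
    }

    // Dice coefficient over multisets: a shared bigram contributes the smaller
    // of its two counts to the overlap.
    public static double dice(String a, String b) {
        Map<String, Integer> x = bigramCounts(a);
        Map<String, Integer> y = bigramCounts(b);
        int size = total(x) + total(y);
        if (size == 0) {
            return 1.0; // assumption: two inputs without bigrams count as identical
        }
        int shared = 0;
        for (Map.Entry<String, Integer> entry : x.entrySet()) {
            Integer other = y.get(entry.getKey());
            if (other != null) {
                shared += Math.min(entry.getValue(), other);
            }
        }
        return 2.0 * shared / size;
    }

    public static void main(String[] args) {
        // "aaa" has the bigram "aa" twice and "aa" has it once, so the score is 2/3.
        // A Set-based version would treat both inputs as {"aa"} and report 1.0.
        System.out.println(dice("aaa", "aa"));
    }
}
```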