Weighted random generation (names etc.) in Java
#1
I'm looking through the code and noticing (in the random name generation) the following comment:

Code:
        // TODO: how do I weight name vectors by frequency, without making them
        // gargantuan?

Ok, here's how: You use TreeMap's lowerEntry() method.

A bit of code ... Assuming you have read your names and corresponding weights into the following data structure.

Code:
    private List<Pair<String, Integer>> nameList;

... which, in the context of name generation, would probably be filled as follows:
Code:
    /* 40000 should be enough for the current firstnames_female.txt */
    nameList = new ArrayList<Pair<String,Integer>>(40000);
    /* ... loop through the lines */
        String name = values[0];
        int weight = Integer.parseInt(values[1]);
        nameList.add(Pair.<String,Integer>of(name, weight));
    /* ... */

You can now generate the TreeMap with the weights as follows:

Code:
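    private TreeMap<Integer, String> nameDistribution;
    private int maxNameWeight;

    private void generateNameDistribution()
    {
        // The map has to be created before the first put(), otherwise this throws a NullPointerException.
        nameDistribution = new TreeMap<Integer, String>();
        int count = 0;
        for( Pair<String, Integer> name : nameList) {
            // Key: sum of the weights of all previous names.
            nameDistribution.put(count, name.first);
            count += name.second;
        }
        maxNameWeight = count;
    }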

The format of the data is { (sum of the weights of all names before this one) : (name) }, so each name "owns" as many consecutive integers as its weight. Easy enough.
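To make the layout concrete, here is a small sketch with three made-up names and weights (the class name, the parallel arrays, and the sample data are assumptions for illustration only, not taken from the name files):

```java
import java.util.TreeMap;

class CumulativeLayoutDemo {
    /** Builds { (sum of the weights before this name) : name } from parallel arrays. */
    static TreeMap<Integer, String> build(String[] names, int[] weights) {
        TreeMap<Integer, String> distribution = new TreeMap<>();
        int count = 0;
        for (int i = 0; i < names.length; i++) {
            distribution.put(count, names[i]);
            count += weights[i];
        }
        return distribution;
    }

    public static void main(String[] args) {
        // Made-up sample data: Anna owns 0..4, Berta owns 5, Clara owns 6..9.
        String[] names = {"Anna", "Berta", "Clara"};
        int[] weights = {5, 1, 4};
        System.out.println(build(names, weights)); // prints {0=Anna, 5=Berta, 6=Clara}
    }
}
```

The map holds one entry per name no matter how large the weights are, which is the whole point compared to storing a name once per point of weight.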

Now to get a random weighted name out of it:

Code:
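    private String getRandomName(Random rnd)
    {
        // nextInt() yields 0 .. maxNameWeight-1. The +1 is needed because
        // lowerEntry() returns the entry with the greatest key strictly
        // less than its argument.
        return nameDistribution.lowerEntry(rnd.nextInt(maxNameWeight) + 1).getValue();
    }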

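That's it. The same idea also works with non-integer weights if you want: key the TreeMap by Double, draw rnd.nextDouble() * maxNameWeight, and look the result up with floorEntry() instead of lowerEntry(), since the draw can be exactly 0. Hope it helps.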
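For the non-integer case, a minimal sketch could look like the following. The class and method names are hypothetical, and the choice to skip non-positive weights is my assumption rather than anything the name generator currently does:

```java
import java.util.Random;
import java.util.TreeMap;

class WeightedDoubleChoice {
    // Key: sum of the weights of all earlier entries; value: the name.
    private final TreeMap<Double, String> distribution = new TreeMap<>();
    private double totalWeight = 0.0;

    void add(String name, double weight) {
        if (weight <= 0.0) {
            return; // assumption: entries that can never be drawn are skipped
        }
        distribution.put(totalWeight, name);
        totalWeight += weight;
    }

    /** Maps a roll in [0, totalWeight) to the name that owns that range. */
    String pick(double roll) {
        // floorEntry() returns the greatest key <= roll, so a roll of 0.0 works.
        return distribution.floorEntry(roll).getValue();
    }

    /** Draws a random name; requires at least one entry to have been added. */
    String random(Random rnd) {
        return pick(rnd.nextDouble() * totalWeight);
    }
}
```

Splitting pick() from random() is only there so the lookup can be checked with fixed rolls; in real use you would call random() alone.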
#2
Are you looking at trunk with the latest revision? I rewrote large parts of the random name generator not too long ago, and that comment is no longer relevant. Either you are looking at old code or I forgot to remove the comment (which is probably the case).
#3
I'm looking at the trunk (megamek.client.RandomNameGenerator, which means it's in MegaMek as well), and the comment is still there. So is the code that adds each name to the vector once per point of weight. Lines 166-178 there as an example:

Code:
if (!firstm.containsKey(key)) {
  Vector<String> v = new Vector<String>();
  while (i < weight) {
    v.add(name);
    i++;
  }
  firstm.put(key, v);
} else {
  while (i < weight) {
    firstm.get(key).add(name);
    i++;
  }
}

I think the TreeMap method is generic enough to be of interest anyway though, even if you don't want to change the name generation specifically (Never touch a running system ...).
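As a rough sketch of what the quoted block could turn into with the TreeMap method: `key`, `name` and `weight` are the variables from the snippet above, while the class itself, the `totals` map and the method names are purely my assumptions and not existing MegaMek code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

class PerKeyNames {
    // One cumulative-weight map per key instead of one Vector with repeated names.
    private final Map<String, TreeMap<Integer, String>> firstm = new HashMap<>();
    // Running sum of weights per key (hypothetical helper map).
    private final Map<String, Integer> totals = new HashMap<>();

    void add(String key, String name, int weight) {
        if (weight <= 0) {
            return; // assumption: names with no weight are never drawn
        }
        int sum = totals.getOrDefault(key, 0);
        firstm.computeIfAbsent(key, k -> new TreeMap<>()).put(sum, name);
        totals.put(key, sum + weight);
    }

    /** Maps a roll in [0, total weight of key) to a name. */
    String pick(String key, int roll) {
        return firstm.get(key).floorEntry(roll).getValue();
    }

    /** Draws a random name for a key that has at least one entry. */
    String random(String key, Random rnd) {
        return pick(key, rnd.nextInt(totals.get(key)));
    }

    int entries(String key) {
        return firstm.get(key).size();
    }
}
```

A name with weight 1000 then costs one map entry instead of 1000 vector slots, at the price of an O(log n) lookup per draw.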
#4
Yes, you're correct. I didn't have the source at hand when I responded before. I looked, and it was RandomUnitGenerator that I changed, which was the more egregious case. I had meant to go back and fix RandomNameGenerator as well when I fixed RandomUnitGenerator, but I didn't get around to it. RandomUnitGenerator was taking up a very large amount of extra memory due to the number and size of RATs (plus new ones were being added). RandomNameGenerator does waste memory, but not on the scale of RandomUnitGenerator. It should be fixed. If you want to submit a patch to the patch tracker (https://sourceforge.net/p/megamek/patches/), I can look into getting it added.
#5
I'll look into creating a patch later. For now, I've attached a slightly cleaned-up WeightedChoice class I'm using for such purposes. While I didn't test it too extensively, it should work for most cases. Feel free to use it any way you like (if in doubt, assume it's licensed as CC0 or some such).


Attached Files
.java   WeightedChoice.java (Size: 5.9 KB / Downloads: 2)