25 August 2006

Doing Named Entity Recognition? Don't optimize for F1

(Guest post by Chris Manning. Thanks Chris!)

Among ML-oriented nlpers, using a simple F1 of precision and recall is the standard way to evaluate Named Entity Recognition. Using F1 seems familiar and comfortable, but I think most nlpers haven't actually thought through the rather different character that the F1 measure takes on when applied to evaluating sequence models. It's not just that it's a type 4 loss (a simple, intuition-driven measure like accuracy): In most cases such measures are reasonable enough for what they are, but using F1 for NER has an under-appreciated dysfunctional character. You wouldn't want to optimize for it!

This post explains what I was really thinking about when I made the comment that Hal referred to previously (fortunately, I didn't try to say all this after the talk!). I agree with Hal: the paper was a very nice piece of work technically. I just think that the authors, Jun Suzuki et al., chose a bad peak to climb.

Everyone is familiar with the F1 measure for simple classification decisions. You draw a 2x2 contingency table of whether something should be yes/no, and whether the system guessed yes/no, and then calculate the harmonic mean of precision and recall. But now think about Named Entity Recognition. You're chugging through text, and every now and again there is an entity, which your system recognizes or doesn't or fantasizes. I will use the notation word/GOLD/GUESS throughout, with O denoting the special background class of not-an-entity. So there are stretches of plain text (drove/O/O along/O/O a/O/O narrow/O/O road/O/O). These are the non-coding regions of NER. Then there are sequences (of one or more tokens) where there was an entity and the system guessed it right (in/O/O Palo/LOC/LOC Alto/LOC/LOC ./O/O), where there was an entity but the system missed it (in/O/O Palo/LOC/O Alto/LOC/O ./O/O), and where there wasn't an entity but the system hypothesized one (an/O/O Awful/O/ORG Headache/O/ORG ./O/O).

Things look good up until here: those events map naturally onto the true negatives (tn), true positives (tp), false negatives (fn), and false positives (fp) of the simple classification case. The problem is that there are other events that can happen. A system can notice that there is an entity but give it the wrong label (I/O/O live/O/O in/O/O Palo/LOC/ORG Alto/LOC/ORG ./O/O). A system can notice that there is an entity but get its boundaries wrong (Unless/O/PERS Karl/PERS/PERS Smith/PERS/PERS resigns/O/O). Or it can make both mistakes at once (Unless/O/ORG Karl/PERS/ORG Smith/PERS/ORG resigns/O/O). I'll call these events a labeling error (le), a boundary error (be), and a label-boundary error (lbe).

I started thinking along these lines just as an intuitive, natural way to characterize happenings in NER output, where entities are sparse occurrences in stretches of background text. But you can make it formal (I wrote a Perl script!). Moving along the sequence, the subsequence boundaries are: (i) at the start and end of the document, (ii) anywhere there is a change to or from a word/O/O token from or to a token where either guess or gold is not O, and (iii) anywhere that gold and guess both change their class assignment simultaneously, regardless of whether they agree. If you chop into subsequences like that, each can be assigned to one of the above seven classes.
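For concreteness, here is a minimal Python sketch of that chopping-and-classifying procedure (this is my illustrative reconstruction, not Chris's actual Perl script; tokens are represented as (gold, guess) label pairs, with "O" for the background class, and BIO prefixes are ignored):

```python
def segments(pairs):
    """Split a tagged sequence into subsequences at: document start/end,
    any transition to/from an O/O token, and any point where gold and
    guess both change label simultaneously."""
    cuts = [0]
    for i in range(1, len(pairs)):
        g0, s0 = pairs[i - 1]
        g1, s1 = pairs[i]
        bg0 = g0 == "O" and s0 == "O"   # background (non-coding) token?
        bg1 = g1 == "O" and s1 == "O"
        if bg0 != bg1 or (g0 != g1 and s0 != s1):
            cuts.append(i)
    cuts.append(len(pairs))
    return [pairs[a:b] for a, b in zip(cuts, cuts[1:])]

def classify(seg):
    """Assign one of the seven event types to a subsequence."""
    gold = [g for g, _ in seg]
    guess = [s for _, s in seg]
    gold_ent = any(g != "O" for g in gold)
    guess_ent = any(s != "O" for s in guess)
    if not gold_ent and not guess_ent:
        return "tn"
    if not gold_ent:
        return "fp"
    if not guess_ent:
        return "fn"
    if gold == guess:
        return "tp"
    same_labels = {g for g in gold if g != "O"} == {s for s in guess if s != "O"}
    same_spans = [g != "O" for g in gold] == [s != "O" for s in guess]
    if same_labels and not same_spans:
        return "be"
    if same_spans and not same_labels:
        return "le"
    return "lbe"

# "Unless/O/PERS Karl/PERS/PERS Smith/PERS/PERS resigns/O/O"
print([classify(s) for s in segments(
    [("O", "PERS"), ("PERS", "PERS"), ("PERS", "PERS"), ("O", "O")])])
# -> ['be', 'tn']
```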

Now, the thing to notice is that for the first 4 event types, you are either correct or you get 1 demerit, assessed to either precision or recall. In the simple classification case, that's the end of the story and the F1 measure is sensible. But when doing precision and recall over subsequences, there are these other three event types. Each of them is assessed a minimum of 2 demerits, with both precision and recall being hit. Therefore, it is fairly clear that optimizing for F1 in this context will encourage a system to do the following: if I'm moderately uncertain of either the class label or the boundaries of the entity, because a mistake would cost me a minimum of 2 demerits, I'm better off proposing no entity, which will cost me only 1 demerit.
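A toy calculation makes the incentive concrete (illustrative numbers, and using expected counts as a simplification): suppose a system is sure of 90 out of 100 gold entities and, for the remaining 10, knows the spans but is only p-sure of the labels.

```python
def f1(tp, fp, fn):
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)

def guess_f1(p):
    # Guess a label for each of the 10 uncertain entities: a right guess
    # is a tp; a wrong guess is a labeling error, costing one precision
    # demerit AND one recall demerit (expected counts).
    return f1(90 + 10 * p, 10 * (1 - p), 10 * (1 - p))

abstain = f1(90, 0, 10)   # skipping all 10 costs only recall
```

Abstaining scores F1 of about 0.947, and guessing beats that only when p exceeds roughly 0.47; a maximum-a-posteriori guess among, say, four entity labels can easily be worth making at much lower confidence than that, but F1 tells the system to stay silent.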

(Two notes:

(i) As I've defined events, the number of demerits for an event in the last three classes is unbounded, though in practice 2 is the most common case. For example, this lbe event would be assessed 4 demerits (3 to precision, and 1 to recall): Smith/ORG/PERS and/ORG/O Newcomb/ORG/PERS and/ORG/O Co./ORG/ORG.

(ii) Despite my title, the problem here isn't with the F measure per se, as Bob Moore emphasized to me at a coffee break during ACL 2006 (thanks!). The problem would occur with any measure that combines precision and recall and which is increasing in both arguments, such as the simple arithmetic mean of precision and recall.)

Observe that this behavior is the opposite of the way things were meant to work: people adopted F1 in IR rather than using accuracy because accuracy gives high scores to a system that returns no documents, which obviously isn't useful. But, here, optimizing for F1 is encouraging a system to not mark entities.

Now let's look at some data. I used this event classification system on the output of my NER system on the CoNLL 2003 shared task English testa data. Here is how many events of each type there were:

tn 5583
tp 4792
fn 118
fp 120
le 472
be 102
lbe 75

Note in particular that over 2/3 of the errors are in those 3 extra categories that are multiply penalized. The ratios of classes vary with the task. For example, in biological NER, you tend to get many more boundary errors. But in my experience it is always the case that lots of the errors are in the last 3 classes.
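As a quick sanity check on the "over 2/3" claim, plugging in the counts above:

```python
counts = {"tn": 5583, "tp": 4792, "fn": 118, "fp": 120,
          "le": 472, "be": 102, "lbe": 75}
errors = sum(counts[k] for k in ("fn", "fp", "le", "be", "lbe"))
multi = sum(counts[k] for k in ("le", "be", "lbe"))   # multiply-penalized
print(errors, multi, round(multi / errors, 3))   # 887 649 0.732
```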

Moreover, some of the errors in the le and be classes are not that bad, and sometimes even reflect subtle judgement calls and human annotator inconsistency in the gold standard. For instance, in the GENIA data you can find both regulation/O of/O human/DNA interleukin-2/DNA gene/DNA expression and transduction/O to/O the/O human/O IL-2/DNA gene/DNA, where it is unclear whether to include human in the name of the gene. Or in a newswire phrase like the Leeds stadium, it's not always very clear whether Leeds should be tagged ORG as a reference to the football team or LOC as a reference to the city. In almost any imaginable task, you would prefer systems that made these errors to ones that missed such entities entirely. In other words, the F1 measure is punishing more severely mistakes that should be punished less according to reasonable intuitions of task utility.

Has this been noticed before? I think so. The ACE program has a system for giving partial credit. But most ML people react very negatively to a scoring system that you couldn't possibly write on a napkin and which involves various very arbitrary-looking constants.... Do these observations undermine the last decade of work in NER? I don't think so. It turns out that there are lots of measures that are pretty okay providing you do not specifically optimize for them, but are dysfunctional if you do. A well-known example is traditional readability measures.

p.s. As I finish writing this guest post, it's occurred to me that I think this is the first nlpers post with some actual natural language examples in it. If you're reading this post, I guess that at least shows that such content isn't actively filtered out!

114 comments:

  1. Very interesting -- I'm glad you put so much effort into thinking about this so the rest of us don't have to :).

    I'd like first to comment a bit about the ACE measure (restricted to NE tagging). Here, essentially what you do is create a bipartite graph. One half is the "true mentions" and the other half is the "system detected mentions." We then do a matching between them (ignore how for a second) and compute a final score based on this matching. For instance, you get docked some points if types don't match across links. Matches that don't overlap by some character-based minimum amount (e.g., 90% of the characters) are disallowed. You also get penalized for misses (unmatched elements on the truth side) or false alarms (unmatched elements on the system side). The way the matching is actually computed is so as to MAXIMIZE your end score: this is a straightforward bipartite matching problem and can be solved efficiently.
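    The matching step might be sketched like this (a stdlib-only illustration with invented per-pair scores, not the actual ACE scorer or its constants; a real implementation would solve the bipartite matching efficiently rather than by brute force):

```python
from itertools import permutations

def pair_score(truth, system):
    # Toy link score: real ACE uses character-overlap thresholds and
    # tuned constants; these numbers are invented for illustration.
    if truth["span"] != system["span"]:
        return None            # disallowed match
    return 1.0 if truth["type"] == system["type"] else 0.5

def best_matching_score(truths, systems):
    # Try every assignment of true mentions to system mentions and keep
    # the one that MAXIMIZES the total score. None slots stand for
    # "unmatched" (a miss or a false alarm), scored 0 here, whereas ACE
    # assesses penalties for them.
    slots = list(range(len(systems))) + [None] * len(truths)
    best = 0.0
    for perm in permutations(slots, len(truths)):
        total = 0.0
        for t, j in zip(truths, perm):
            if j is None:
                continue
            s = pair_score(t, systems[j])
            if s is not None:
                total += s
        best = max(best, total)
    return best
```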

    The major thing that I think the ACE score misses that Chris talks about is the issue of span. According to ACE, either spans overlap or they don't. 90% (IIRC) of characters in common is a match, 89.9% is not. This, for instance, would not allow for many of Chris' examples. Moreover, it doesn't fix the problem that only getting "human" in "human/DNA interleukin-2/DNA gene/DNA" is probably much worse than only getting "gene." But to know that, e.g., "gene" is the head, or that last names are (generally) more important than first names, we would need to actually annotate this, or come up with heuristics (likely akin to Mike Collins' head-finding heuristics for NPs).

    A second issue is believability, which Chris also mentioned in terms of "ML people react very negatively to a scoring system that you couldn't possibly write on a napkin and which involves various very arbitrary-looking constants." I think there are two reasons for this.

    (1) Complex loss functions with weird constants are just not believable. As I talked about before, we really want loss functions that generalize, not ones that just fit the data. If we fit the data using a lot of features and weights, but the result is not intuitive (our "prior" says "crazy!"), we're not going to believe it will generalize. I don't think the ACE metric is all that bad in this respect (I've seen far worse). But I have had lengthy conversations with hardcore ML people who think that even something like BLEU is too complex.

    (2) Practically speaking, complex loss functions are hard to optimize. Hamming loss is just so easy in comparison to really anything else (even F1 is hard, as is noted by the technical difficulty in the paper that brought up this conversation). Directly optimizing ACE or BLEU is quite difficult and the sorts of techniques that we (as machine learning folk) use are not really up to snuff here.

    One question this brings up -- and I don't really think it has an answer (but I'll ask it anyway) -- is: how can we tell if a loss function is good? Of course, if we can compare it to real evaluations, this is great, but what if we can't (e.g., in parsing or NE tagging)? Is F1 good for parsing, for instance? Is the ACE metric good, or at least better than F (in some formal sense)? Is there a way to tell whether a metric is easy to game?

    Okay, a long post gets a long reply, but I'll stop here :). Thanks again to Chris for putting this together.

  2. The "problem" Chris raises can be attributed to focusing on first-best solutions. I like to think of the problem of entity extraction as more like search and less like first-best classification (though there is clearly a deep connection between search and classification). Downstream applications typically focus on gathering/tracking information about some known entities (e.g. protein p53 and things it regulates in humans), or on mining data to discover relationships or patterns (e.g. "species=human, regulator: p53, regulated: human insulin-like growth factor II").

    Suppose you have a system that can return multiple mentions and spans from a text input and give them scores that are comparable across docs. We like to use conditional probabilities p(mention of type T from position n to n+k|text) here, because they're easy to combine with downstream processing like social network analysis and because they have the desirable cross-document comparability. For instance, here's some output drawn from LingPipe's Named Entity Tutorial (trained from NCBI's GeneTag corpus):


    p53 regulates human insulin-like growth factor II gene expression through active P4 promoter in rhabdomyosarcoma cells.

    Mentions by confidence: "p53": 0.9999, "p4 promoter": 0.7328, "insulin-like growth factor II gene": 0.6055, "human insulin-like growth factor II gene": 0.3817, "active P4 promoter": 0.1395, "P4": 0.0916, "active P4": 0.0088, "insulin-like growth factor II": 0.0070, "human insulin-like growth factor II": 0.0044, ...

    The numbers are conditional probability estimates of the phrase being a gene given the input text (this is actually done by span, not by phrase, but the above is easier to read with the limited formatting available in this blog). You can see that it's just not sure if "human" should be in there, just as in Chris's example.

    A ranking of entities along with a reference gold standard allows you to draw a precision-recall curve, compute mean-average precision (MAP), precision-recall breakeven point (BEP), compute precision-at-n-documents, and so on. For instance, for genes, we can do 99.5% recall at 10% precision for search, or tighten precision down to 99% for mining applications.
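    The ranking-based metrics mentioned here are easy to sketch (a generic illustration, not LingPipe's scorer; `ranked_hits` marks which system mentions, in confidence order, are in the gold standard, and `n_gold` is the total number of gold mentions):

```python
def pr_points(ranked_hits, n_gold):
    # Precision/recall after each rank of a confidence-ranked mention
    # list: ranked_hits[k] is 1 if the (k+1)-th mention is correct.
    points, tp = [], 0
    for k, hit in enumerate(ranked_hits, 1):
        tp += hit
        points.append((tp / k, tp / n_gold))   # (precision, recall)
    return points

def average_precision(ranked_hits, n_gold):
    # Mean of precision at the ranks where a correct mention appears;
    # averaging this over queries/documents gives MAP.
    tp, total = 0, 0.0
    for k, hit in enumerate(ranked_hits, 1):
        if hit:
            tp += 1
            total += tp / k
    return total / n_gold
```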

    Perhaps even more importantly, we can combine information across mentions. This was originally done statistically by Mark Craven, I believe. Most simply, we can combine scores from all mentions of "P4 promoter" and estimate its total count in a corpus. This allows very high precision extraction if we take high estimated count events. We can also use this kind of output as the basis of rescoring by downstream models such as relation extractors (as done by Dan Roth) and coreference resolvers (Heng Ji and Ralph Grishman), though that's typically done with whole-sentence n-best (which LingPipe also does) rather than per-entity n-best.

    Historical note: The whole alignment/partial-credit thing goes back to MUC. It's possible to remove the alignment part of it and just give partial credit for overlaps with the same type or exact span matches with different types and to break all that down by type and what not. But it's still not very easy to understand the final numbers. As Chris's figures show (and they're typical), most errors get partial credit, so this scoring often comes close to halving error rates. DARPA's performance figure reverse engineering is interesting from both a technical and sociological/organizational/marketing perspective.

  3. While I agree that it is often important to not just provide a single-best output but either an n-best list or a distribution over outputs, I feel that this is something of a side issue. (Of course, if we're trying to produce probabilities, then the whole "optimizing F1" issue is irrelevant, since we'll instead be optimizing for conditional probability.) It would of course be possible to extract n-best lists from a system trained using Suzuki et al.'s technique, but if the loss being optimized is fundamentally wrong (as Chris contends) then doing so won't help us.

    In the end (*), what we want is a loss function for NE rec such that low loss implies good inputs for systems further down the pipeline (since there are only very few imaginable scenarios in which NER is the final stage). It may turn out, as Chris contends, that conditional probability is a pretty darn good option. But whether we use an n-best list or not seems tangential to me.

    (*) Of course, it may be possible to directly optimize NER performance on the basis of final task performance, without inventing a new loss function. In fact, I hope this is true :).

  4. Think about this when deciding whether first-best is enough: when's the last time you pressed Google's "I'm Feeling Lucky" button? A system that's limited to a first-best state-of-the-art 80-85% recall is going to miss a lot of entities. I believe a relevant task eval is the precision you can deliver at 99.5% recall. Or what recall you can get at 95% precision.

    Optimizing for first-best can actually hurt performance at different precision/recall points. Our own long-distance models are more attenuated than the more local ones; they're about 10% absolute better on F measure, but much worse on precision-at-100 docs, area-under-ROC curve, or MAP-type measures.

    Assuming that partial matches are desirable, as Chris did, assumes that downstream processes have the means to correct noisy first-best output. I think it's easier to combine and rescore than it is to correct, but that's just a hunch.

    Scoring by cross-entropy estimates of the gold standard versus a system model makes a lot of sense. The score is just the sum of the estimated log probs for the reference annotations. It works not only for NE, but for just about any task you can imagine. Not only for evaluation, but for things like EM training. It also makes sense with either joint or conditional estimates.

    The advantage of scoring by cross-entropy is that developers don't need to write decoders -- just scorers. The problem for a bakeoff scoring by cross-entropy is that it assumes systems are properly normalized probabilistic systems, which rules out SVMs, TF/IDF, heuristic pattern filters (unless they can be integrated with rescoring), etc.

    Scoring ranked n-best entities a la TREC, as I suggested, eliminates normalization errors (and cheating possibilities). But it requires ranked output, and for best performance, n-best. I believe downstream systems require n-best anyway, so this doesn't seem like much of an imposition.

  5. I more-or-less agree that first best is not always the best way to evaluate a system, especially one that is going to be embedded within a larger system. There are many options for how to embed, the simplest of which are probably n-best lists or samples from the posterior (see Chris' paper).

    All I'm saying is that I think that the issue of F1 being suboptimal for optimization is independent of the one-best versus many distinction. If the loss you're optimizing is fundamentally broken, then you're essentially leaving it up to chance that it's somehow "close enough" that within an n-best list, you'll get something reasonable. But this is a strong assumption. I think that if you believe what Chris is saying, then you must believe it whether you are producing single best outputs or n-best outputs.

  6. Given that I have recognized the boundaries of named entities in a large text corpus, are you aware of any unsupervised technique that classifies these entity candidates?

  7. Abhishek: I'd take a look at Collins and Singer's bootstrapping approach for that. (Plus things that cite it.)

  8. I'm a little slow off the bat, but:

    I am interested in some clarifications of Chris's error classification. The described segmentation method would leave some cases accounted for very strangely (though they may be rare in English NER).

    For example:

    Christopher|I-PER|I-PER Manning|I-PER|B-PER will be marked as a single BE, presumably. Will Christopher|I-PER|I-PER Manning|I-PER|I-ORG be an LBE?

    With this segmentation, if a system tags an entire sentence with one tag, or with many tags which continually overlap with gold-standard entities but whose boundaries never coincide, it will be counted as a single segment, and judged as a maximum of 1 error. Surely this is just as useless a result.

    I am basically not convinced that we have a clear way of counting boundary errors.


    Instead, perhaps, we could give a score for each correct and predicted annotation, in a way that simple cases (exact matches, boundary errors, etc.) will award two agreeing points:

    Unless/O/B-PERS Karl/B-PERS/B-PERS Smith/B-PERS/B-PERS resigns/O/O

    Here correct Karl Smith corresponds to predicted Unless Karl Smith. From the perspective of either annotation, this is a boundary error.

    Christopher|I-PER|I-PER Manning|I-PER|B-PER

    This might be scored as two boundary errors from the perspective of the predictions, and one (label-)boundary error from the perspective of the correct entity.

    But this scoring is biased against false positives and false negatives, and probably has a number of other faults.

  9. When computing precision, recall, and F-measure, do you exclude non-coding regions of the text (e.g., "drove/O/O along/O/O a/O/O")?

    It seems that if you include non-coding regions, a set of evaluation documents with very few entities will produce artificially high results. Technically, however, a classification result of "not an entity" is still a classifier result (capable of false positives, false negatives, true positives, etc.). Any thoughts on this? Should non-coding tokens be included or not?

    Timberland shoes or boots
    cheap timberlandSince the Timberland
    branddiscount timberlands
    These boots are specially timberland
    Boots
    How we tall them apart?cheap
    timberland uk
    But on the other hand there are usually quite few real Timberland
    discount timberland outlet lines
    being sold for a fraction of the price. Sunglasses discounttimberland outlet are not limited
    for the summer season alone. This is basically because sunglasses newest Gucci sunglasses
    Here are five simple steps discount coach
    sunglasses

    In Marketing, discount Oakley sunglasses
    Now because of customized sunglassRay Ban
    sunglasses outlet
    racks, unreclaimed creature, without refinement,
    Nike shoesair max nike
    in 1987 first time. Since then Nike newest air
    max
    has been frequently introducing new as well as updated models in it.cheap air max shoesJust because air max Clearly, consumers are just as happy as athletes.air max nike Also on thisJordan ajf shoes pair they Moncler,orMoncler jackets,maybe you like Moncler coats,discount Moncler Vest,you can choose someMoncler outletandmoncler polo t-shirt

    ReplyDelete
  56. Nice post, thanks for sharing this wonderful and useful information with us.

    ReplyDelete
  57. MBT will not only change,MBT boots, the way you use your musclesMBT Shoes in fashion, but will improve the use of your joints and spineCheap MBT Shoes sale, The uniquely designed sole combined withDiscount MBT outlet 2010, correct training achieves a more active andMBT Walking Shoes, healthy posture and walk. What are the benefits?Cheap Dior shoes, Free shipping and free return shippingDiscount Dior bags, on all dior. Included with each pair of isDior sunglasses, an instructional DVD. What is Dior?Dior sunglasses outlet 2010, Technology or dior is not just a shoe in the ordinary sense ?Newest Dior sunglasses,
    Do not worry. The urge to buy these goods will be very strong once you spot the shoes or clothes that you likeDiscount New Balance shoes,revolutionary fitness aid from Swiss Masai,New Balance 580 outlet, which may help reduce cellulite andCheap New Balance 850, Relieves muscular tension back and joint problems Leads to a relaxed upright,New Balance running shoes, Through its unique design of its multilayered soleNew Balance shoes 2010, transforms flat hard artificial surfaces into natural uneven surfaces theDiscount PUMA Outlet, with top quality and cheap price. Cheap PUMA running shoes, innovative sole design includes thePuma sneaker 2010, curved sole which is theKids Puma Shoes, is good for our foot
    Puma Walking Shoes, back ,nice,good and knee is Puma Shoes. Here you can buy wide rangewholesale cl high heel sandals,quality and cheap car GPS navigation systemsMoncler,Very Cool, Comfortable and lightmoncler jackets original packing you can rest assured.moncler coats,You might say that thediscount moncler vestNo one ever thought bothmoncler outlet,As with everything that comes fmonmoncler t-shirtThe new store has been decorated

    ReplyDelete
  58. No matter what product you purchase from us north face jacketsWe are offering you a wide range ofnorth face outletquestion,Some color combinations seem to never get old north face outerwearBut within the same community north face coatsbecause it features just the right amount of north face uk,look at another good paoduct such as Dior totes,A little of these are given below.ugg bootswas a very well-known French fashionable boot
    cheap ugg boots,because of the wisdom of brilliant featuresdiscount ugg boots,which you are buying is unique and original classic ugg boots,When possible, they obtain materials from domestic suppliers ugg classic tall boots,which makes it exclusive and uniquebabyliss,I like the other two designs as wel Benefit GHD,A stroll around the park with the GHD IV Salon Styler,They're also used to buy GHD Mini Styler,A stroll around the park with GHD Precious gift,They're also used to help prevent
    GHD Rare Styler,Following the success of the initial gray ghd,The extremity of the sole is gold ghd
    And those who buy this ghd Instyler,people who work outside and so wearisome Kiss ghd,I would recommend moving up to pink ghd,If your own walks or intentions for this pure black ghd,I wouldn’t hesitate to recommend the pure white ghd,our price is very reasonable purple ghd

    ReplyDelete
  59. Fashion trends change on daily basis, like Gold GHD. Following the latest in designer shades has become a passion of everyone, now Burberry Sunglasses. If you are the type of a woman who loves to explore in fashion, our ED Hardy Sunglasses will definitely satisfy your taste. Cheap Ed Hardy Sunglasses is also OK. Ed hardy streak of clothing is expanded into its wholesale ED Hardy T-shirt chain so that a large number of fans and users can enjoy the cheap ED Hardy Clothing range easily with the help of numerous secured websites, actually, our ED Hardy Outlet. As we all know, in fact discount ED Hardy, is based on the creations of the world renowned tattoo artist Don Ed Hardy. Well, this question is bound to strike the minds of all individuals. Many people may say Prada shoes is a joke, but we can give you Prada Sunglasses, because we have Prada handbags. Almost everyone will agree that Prada Purses are some of the most beautiful designer handbags marketed today. Now we have one new product: Prada totes. The reason is simple: fashion prohibited by ugg boots, in other words, we can say it as Cheap ugg boots. Would you like to wear Discount ugg boots. We have two kinds of fashionable boots: classic ugg boots and ugg classic tall boots. Ankh Royalty--the Cultural Revolution. Straightens out the collar, the epaulette epaulet, the Ankh Royalty Clothing two-row buckle. Now welcome to our Ankh Royalty Outlet. And these are different products that bear the most famous names in the world of fashion, like Ankh Royalty T shirt by the way -Prada, Spyder, Moncler(Moncler jackets,or you can say Moncler coats, Moncler T-shirt, Moncler vest,and you can buy them from our discount Moncler outlet), GHD, ED Hardy, Ankh Royalty, Twisted Heart.

    ReplyDelete
  60. it's good to see this information in your post, i was looking the same but there was not any proper resource, thanx now i have the link which i was looking for my research.

    UK Dissertations Help

    ReplyDelete
  61. Everyone is familiar with the F1 measure for simple classification decisions.
    cara meninggikan badanYou draw a 2x2 contingency table of whether something should be yes/no, and whether the system guessed yes/no, and then calculate the harmonic mean of precision and recall.

    ReplyDelete
  62. Among ML-oriented nlpers, using a simple F1 of precision and recall is the standard way to evaluate Named Entity Recognition. Using F1 seems familiar and comfortable, but I think most nlpers haven't actually thought through the rather different character that the F1 measure takes on when applied to evaluating sequence models. tinggi badan

    ReplyDelete
  63. Wonderful post, thanks for putting this together! "This is obviously one great post. Thanks for the valuable information and insights you have so provided here. Keep it up!"
    Dissertation Help | Custom Dissertation

    ReplyDelete
  64. As a newbie, this article was really helpful, thanks for sharing!

    ReplyDelete
  65. Nice information, many thanks to the author. It is incomprehensible to me now, but in general, the usefulness and significance is overwhelming. Thanks again and good luck!

    Term papers

    ReplyDelete
  66. NewStreetFashion
    Ed Hardy
    stylish design
    Ed Hardy Wholesale
    fashion excellent quality
    wholesale Ed Hardy
    ED Hardy clothing bring you a super surprise!
    ed hardy wholesale clothing
    The quality is so good
    christian audigier
    Young and creative style
    abercrombie and fitch
    You can have a look at it.
    abercrombie & fitch
    jordan 8
    jordan 9
    jordan 10

    ReplyDelete
  67. What a great style. Very informative one, I hope you will continue your research.
    I can offer you term paper
    on this subject. Thank you.

    ReplyDelete
  68. It's an interesting approach. I usually see ordinary views on the subject but yours it's written in a pretty special manner. Sure enough, I will revisit your web site for more information.

    Term papers

    ReplyDelete
  69. We offer the Farouk Chi Flat Iron. We provide the best price and free shipping for all the
    chi flat iron. As we know, the
    ghd iv styler is the first class and famous brand. So it is the good chance for you. Don't let it pass. If you are looking for the
    babyliss flat iron, you have come to the right place for
    instyler rotating hot iron.


    GHD straighteners was known as
    ghd flat iron, which was authorized online
    GHD seller provides all kinds of hair straighteners,pink ghd,purple ghd,babyliss. By visiting
    ghd iv salon styler , you will find what you want and made yourself more beautiful.If you miss it ,you miss beauty.Buy a piece of ghd for yourself.Come and join us
    http://www.ghdhairs.com/ to win the
    ghd iv mini styler.
    ghd uk
    ghd australia
    ghd africa
    ghd southafrica
    t3 hair dryer
    purple ghd straighteners
    ghd spain
    ghd ireland
    ghd denmark
    ghd america
    ghd italy
    ghd germany
    ghd france
    ghds
    cheap ghd
    purple ghd straighteners

    ReplyDelete
  70. Young and creative style.
    abercrombie and fitch
    abercrombie & fitch
    You can have a look at it.
    Abercrombie and fitch outlet
    ED Hardy clothing bring you a super surprise!
    ed hardy wholesale clothing
    If you really want it.
    nike outlet

    ReplyDelete
  71. This is really a nice blog, I appreciate you for telling us so nice things, thank you!By the way, if you like nike tn you can come here to choose! We have a lot of
    nike tn,tn chaussures,
    nike tn requin chaussures,nike air max tn chaussures.
    If you want to find the shoes according to the sorts, then here you can have the informations,
    we classied the shoes in nike presto,
    nike air max,
    vibram fivefingers,
    converse.
    At the same time, the vibram also offer you in our store.
    You also can choose the most fashionable sunglasses here, it really can make you different from other people. We have
    sunglasses,designer sunglasses,
    wholesale sunglasses,sunglasses discount in USA.
    They includ men's sunglasses,women's sunglasses.
    So many fashion brands are for you,like Dior Sunglasses,
    Emporio Armani Sunglasses,
    Fendi Sunglasses,
    Giorgio Armani Sunglasses,
    Gucci Sunglasses,
    LV Sunglasses and so on.

    ReplyDelete
  72. Then there are sequences (of one or more tokens) where there was an entity and the system guessed it right (in/O/O Palo/LOC/LOCAlto/LOC/LOC ./O/O), where there was an entity but the system missed it (in/O/O Palo/LOC/O Alto/LOC/O ./O/O), and where there wasn't an entity but the system hypothesized one (an/O/O Awful/O/ORG Headache/O/ORG ./O/O).peninggi badan

    ReplyDelete
  73. The blogger is Huge network for blogging i get lots of interesting information from here, hope blogger will modify and increase attributes to make it simpler.

    Dissertation Writing Help | Dissertation Structure

    ReplyDelete
  74. This is the first time i came to know about this blog and found it amazing.

    Custom Logo Design

    ReplyDelete
  75. Great article..! i like the way of your writing and description.

    Business Logo

    ReplyDelete
  76. Thanks very much for your suggestion.I can get a lot of information from you article.And there is also so much nice jackets for all of you,i hope you like them.
    moncler
    moncler jacken
    moncler jackets
    moncler men
    moncler coats
    moncler women
    Thanks for you attention.

    ReplyDelete
  77. This might be scored as two boundary errors from the perspective of the predictions, and one (label-)boundary error from the perspective of the correct entity.
    website design | web design | free website design | flyer design

    ReplyDelete
  78. This kind of information is very limited on internet. Nice to find the post related to my searching criteria. Your updated and informative post will be appreciated by blog loving people.
    dissertation help|thesis help|assignment help|coursework writing|research writing|essay help

    ReplyDelete
  79. When I see your article, I really agree with you about the blog.I think people will know this after read the information. I hope you will share more with us. At the same time, you also can go to our website to find someting that maybe you like. We have
    nike chaussures,nike shox chaussures,
    nike tn,nike tn requin
    nike air max chaussures,nike chaussures femmes.
    nike chaussures homme,nike chaussures enfants
    We have so many kinds of nike shoes that we are sure you will find the one that you like. Besides, we have the special
    MBT chaussures.
    If you like climbing then you can choose the vibram chaussures in our store.
    You can find the Y-3 Yohji Yamamotoand
    Nike Air Jordan are designed for you!

    ReplyDelete
  80. Wonderful blog, i recently come to your blog through Google excellent knowledge keep on posting you guys,thanks for sharing this post.



    Hughesnet Broadband

    ReplyDelete
  81. Your blog is so nice.I am impressed with your vivid expression.I will
    vuitton
    handbags
    neverfull | louis vuitton mahina
    bookmarked you…keep up the good work!!!!

    ReplyDelete
  82. i like your blog so much its my honor to give my comments on it great work man....

    ReplyDelete
  83. I am very much pleased with the contents you have mentioned. I enjoyed every little bit part of it. It contains truly information. I want to thank you for this informative read; I really appreciate sharing this great.

    ReplyDelete