How to check whether a token is the last token in the document in a JAPE rule


Dinesh
 

I want to check in a JAPE rule whether "x1" appears at the end of the document string. Is there a way to identify whether a token is the last token of the document?


Anusha
 

Hi

You can match the Token for your particular string on the left-hand side and then check on the right-hand side whether that token is the last token in the document, using the code below. If the token is the last one, a "LastToken" annotation is created.

lhs:

({Token.string=="x1"}):lhs

rhs:

AnnotationSet tokAnnots1=bindings.get("lhs");

for(Annotation tokAnn1: tokAnnots1){
  // true only if the token ends exactly where the document content ends
  if(end(tokAnn1).longValue()==doc.getContent().size().longValue()){
    FeatureMap features=Factory.newFeatureMap();
    features.put("string",tokAnn1.getFeatures().get("string").toString());
    outputAS.add(start(tokAnn1),end(tokAnn1),"LastToken",features);
  }
}

Regards

Anusha Annapragada


In reply to Dinesh Zende's message of 11-04-2022 11:33.


Dinesh
 
This is my code
------------------------------------------------------------------------------------------------------------------------------------

Phase: LastQuantity
Input: Token SpaceToken 
Options: control = Appelt
 
Rule: LastQuantity
Priority: 50
(
({ Token.kind == number}):Label
    ({ Token.string == "x"})
 
)
-->
:Label{
     AnnotationSet tokAnnots1=bindings.get("lhs");
     for(Annotation tokAnn1: tokAnnots1)
     {
         if(end(tokAnn1).longValue()==doc.getContent().size().longValue())
         {
             FeatureMap features=Factory.newFeatureMap();
             features.put("string",tokAnn1.getFeatures().get("string").toString());
             outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);
        }
     }
    }
--------------------------------------------------------------------------------------------------------------------------------
And I am getting the following errors:
Error: The method end(Annotation) is undefined for the type LastQuantityLastQuantityActionClass887 at line 27 in japeactionclasses.LastQuantityLastQuantityActionClass887
Error: The method start(Annotation) is undefined for the type LastQuantityLastQuantityActionClass887 at line 31 in japeactionclasses.LastQuantityLastQuantityActionClass887
Error: The method end(Annotation) is undefined for the type LastQuantityLastQuantityActionClass887 at line 31 in japeactionclasses.LastQuantityLastQuantityActionClass887


Ian Roberts
 

There are several problems with this code; it looks like you've tried to combine elements of several different examples.

To fix the compile error, note that start and end are methods of gate.Utils, so you need to add this at the top of your JAPE file, above the "Phase:" line:

Imports: {
  import static gate.Utils.*;
}

But that won't make the rule work, as there are a number of logic errors in there as well. What problem are you actually trying to solve here? Your original question was very brief. Do you want to see if "x1" is the very last two characters in the whole document, or just the last two Tokens, possibly followed by whitespace or other non-Token annotations? (Yes, two tokens: the standard tokeniser will treat "x1" as one token for the "x" and another for the "1".)

Ian

In reply to Dinesh's message of 04/05/2022 13:02.


-- 
Ian Roberts               | Department of Computer Science
i.roberts@...  | University of Sheffield, UK


Dinesh
 

Thank you very much, Ian, for your quick reply.
Actually, I am looking for "1x" (or any number followed by "x"), where these are the last two tokens in the string and represent a quantity. The tokenizer detects the number and the "x" as separate tokens.
I have tried the code given by Anusha in this thread, but I was not able to resolve the end() and start() methods. I am referring to the https://gate.ac.uk/sale/tao/splitch8.html resource.
 


Dinesh
 

After adding the required imports, it now throws "InvalidOffsetException" for "outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);". I have caught that exception, and the code now runs without any error or exception, but no annotation is produced.
Please find the code below:
-------------------------------------------------------------------------

Imports: {
  import static gate.Utils.*;
}
 
Phase: LastQuantity
Input: Token SpaceToken 
Options: control = Appelt
 
Rule: LastQuantity
Priority: 50
(
({ Token.kind == number}):Label
    ({ Token.string == "x"})
 
)
-->
:Label{
     AnnotationSet tokAnnots1=bindings.get("Label");
     for(Annotation tokAnn1: tokAnnots1)
     {
         if(end(tokAnn1).longValue()==doc.getContent().size().longValue())
         {
             FeatureMap features=Factory.newFeatureMap();
             features.put("string",tokAnn1.getFeatures().get("string").toString());
             try
             {
                outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);  
             }
             catch(InvalidOffsetException e) {
                 throw new JapeException(e);
             }
             
        }
     }
    }
 


Ian Roberts
 

if(end(tokAnn1).longValue()==doc.getContent().size().longValue())

checks whether the {Token.kind == number} is at the end of the document, which it will never be as it's always followed by the "x".
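To make the offset argument concrete, here is a minimal plain-Java sketch. It is not GATE API code: it assumes a hypothetical document whose entire content is "qty 1x", and the class and method names are made up for illustration. It applies the same "end offset equals document length" comparison to the number token and to the "x" token.

```java
// Plain-Java model of the offset comparison; hypothetical names, no GATE classes involved.
final class OffsetCheckSketch {

    /** True when a span's end offset equals the length of the document text. */
    static boolean endsAtDocumentEnd(String text, int spanEnd) {
        return spanEnd == text.length();
    }

    public static void main(String[] args) {
        String text = "qty 1x";                 // assumed document content, length 6
        int numberEnd = text.indexOf('1') + 1;  // the number token "1" ends at offset 5
        int xEnd = text.indexOf('x') + 1;       // the "x" token ends at offset 6

        System.out.println(endsAtDocumentEnd(text, numberEnd)); // prints "false"
        System.out.println(endsAtDocumentEnd(text, xEnd));      // prints "true"
    }
}
```

Only the span that ends with the "x" can pass the test, so a check applied to the number token alone never succeeds.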

The problem you're describing is actually very hard to solve in a single JAPE rule, because JAPE doesn't have a way to specifically target the last annotation of a particular kind. Personally I'd probably use a different tool (probably the Groovy plugin), but if it must be JAPE then I'd do three phases:

Phase: AnnotateX
Input: Token
Options: control = all

Rule: FindX
({Token.string == "x"}):x
-->
:x.TempX = {}

Phase: LastX
Input: Token TempX
Options: control = appelt

Rule: DeleteNonFinalX
(
  ({TempX}):x
  {Token}
)
-->
:x {
  inputAS.removeAll(xAnnots);
}

Phase: LastQuantity
Input: Token TempX
Options: control = appelt

Rule: Quantity
(
  ({ Token.kind == number}):qty
  {TempX}
)
-->
:qty.LastQuantity = {string = @qty.Token.string}

The first phase annotates all "x" tokens with a temporary annotation. The second phase deletes all TempX annotations that are followed by another token, leaving a TempX only if the last token in the whole document is "x". The third phase looks for the number followed by TempX, which thanks to the second phase can only ever be at the end.
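As a rough sanity check, the three phases can be modelled outside GATE. The sketch below is plain Java, not JAPE or GATE API code, and rests on a few assumptions: tokens are plain strings (so whitespace is already ignored, as with "Input: Token"), a boolean array stands in for the TempX annotations, and the regular expression `\d+` stands in for Token.kind == number. All class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the three phases; hypothetical names, no GATE classes involved.
final class LastQuantitySketch {

    /** Returns the number token directly before a document-final "x", or null if there is none. */
    static String lastQuantity(List<String> tokens) {
        int n = tokens.size();

        // Phase 1 (AnnotateX): flag every "x" token, like adding a TempX annotation.
        boolean[] tempX = new boolean[n];
        for (int i = 0; i < n; i++) {
            tempX[i] = tokens.get(i).equals("x");
        }

        // Phase 2 (LastX): every position except the last is followed by another token,
        // so its flag is cleared, like deleting a TempX that precedes a Token.
        for (int i = 0; i < n - 1; i++) {
            tempX[i] = false;
        }

        // Phase 3 (LastQuantity): find a number immediately followed by a surviving flag.
        for (int i = 0; i + 1 < n; i++) {
            if (tempX[i + 1] && tokens.get(i).matches("\\d+")) {
                return tokens.get(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(lastQuantity(Arrays.asList("pack", "2", "x", "of", "1", "x"))); // prints "1"
        System.out.println(lastQuantity(Arrays.asList("1", "x", "end")));                  // prints "null"
    }
}
```

In the first input the earlier "2 x" is ignored because its flag is cleared in the second step; in the second input nothing is returned because the final token is not "x".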

Ian

In reply to Dinesh's message of 05/05/2022 04:52.


Dinesh
 

Thank you very much Ian!
That really works!!! 
I made a small adjustment to the last phase, as follows, because it was raising an exception for "@" [gate.creole.ResourceInstantiationException: gate.jape.parser.ParseException:]

----------------------------------------------------------
Phase: LastQuantity
Input: Token TempX LastX
Options: control = appelt
 
Rule: Quantity
(
  ({ Token.kind == number}):qty
  {TempX}
)
-->
:qty.Quantity = {Rule=Quantity}
------------------------------------------------------------------------------------------

And now the last number token before "x" is correctly annotated as 'Quantity'. Thank you once again.
If possible, could you please explain the logic of "LastX" for the GATE community?


Ian Roberts
 

Oh yeah, sorry, I was writing from memory and didn't test my own code - that should have been

:qty.LastQuantity = {string = :qty.Token.string}

(colon instead of @).

Ian


Ian Roberts
 

On 05/05/2022 12:02, Dinesh wrote:
If possible can you please explain the logic for "LastX" for the GATE community?

In my example, LastX is the name of the phase, not a type of annotation. The LastX phase doesn't create any new annotations; it just deletes some of the ones created in phase 1.

The Input line for the final phase should just be "Token TempX".

Ian
