How to check the token is the last token in the string in JAPE Rule
Dinesh
I want to check whether "x1" appears at the end of the document string in a JAPE rule. Is there a way to identify whether a token is the last token of the document?
|
|
Anusha
Hi,
You can take the Token (your particular string) as input and check whether that token is the last token in the document using this code. If your token is the last one, a "LastToken" annotation will be created.
lhs: {Token.string=="x1"}
rhs:
outputAS.add(start(tokAnn1),end(tokAnn1),"LastToken",features);
Regards,
Anusha Annapragada
On 11-04-2022 11:33, Dinesh Zende wrote:
> I want to check if "x1" is appearing at the end of the document string in the JAPE Rule. Is there any way/method to identify if the token is the last token of the document?
|
Dinesh
This is my code:
------------------------------------------------------------
Phase: LastQuantity
Input: Token SpaceToken
Options: control = Appelt

Rule: LastQuantity
Priority: 50
(
  ({ Token.kind == number}):Label
  ({ Token.string == "x"})
)
-->
:Label{
  AnnotationSet tokAnnots1=bindings.get("lhs");
  for(Annotation tokAnn1: tokAnnots1)
  {
    if(end(tokAnn1).longValue()==doc.getContent().size().longValue())
    {
      FeatureMap features=Factory.newFeatureMap();
      features.put("string",tokAnn1.getFeatures().get("string").toString());
      outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);
    }
  }
}
------------------------------------------------------------
And I am getting the following errors:
Error: The method end(Annotation) is undefined for the type LastQuantityLastQuantityActionClass887 at line 27 in japeactionclasses.LastQuantityLastQuantityActionClass887
Error: The method start(Annotation) is undefined for the type LastQuantityLastQuantityActionClass887 at line 31 in japeactionclasses.LastQuantityLastQuantityActionClass887
Error: The method end(Annotation) is undefined for the type LastQuantityLastQuantityActionClass887 at line 31 in japeactionclasses.LastQuantityLastQuantityActionClass887
|
|
Ian Roberts
There are several problems with this code; it looks like you've tried to combine elements of several different examples.

To fix the compile error, note that start and end are methods of gate.Utils, so you need to add this at the top of your JAPE file, above the "Phase:" line:

Imports: {
  import static gate.Utils.*;
}

But that won't make the rule work, as there are a number of logic errors in there as well. What problem are you actually trying to solve here? Your original question was very brief - do you want to see if "x1" is the very last two characters in the whole document, or just the last two Tokens (yes, two tokens: the standard tokeniser will treat "x1" as one token for the "x" and another for the "1"), possibly followed by whitespace or other non-Token annotations?
Ian
On 04/05/2022 13:02, Dinesh wrote:
> This is my code
--
Ian Roberts | Department of Computer Science
i.roberts@... | University of Sheffield, UK
|
Dinesh
Thank you very much Ian for your quick reply.
Actually, I am looking for "1x" as the last two tokens (a quantity) in the string, or more generally any number followed by "x". The tokenizer detects the number and the "x" as separate tokens. I have tried the code given by Anusha in this thread, but I was not able to resolve the end() and start() methods. I am referring to the https://gate.ac.uk/sale/tao/splitch8.html resource.
|
Dinesh
After adding the required imports, it now throws an "InvalidOffsetException" for "outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);". I have caught that exception, and now the code runs without any error or exception, but no annotation is produced.
Please find the code below:
------------------------------------------------------------
Imports: {
  import static gate.Utils.*;
}

Phase: LastQuantity
Input: Token SpaceToken
Options: control = Appelt

Rule: LastQuantity
Priority: 50
(
  ({ Token.kind == number}):Label
  ({ Token.string == "x"})
)
-->
:Label{
  AnnotationSet tokAnnots1=bindings.get("Label");
  for(Annotation tokAnn1: tokAnnots1)
  {
    if(end(tokAnn1).longValue()==doc.getContent().size().longValue())
    {
      FeatureMap features=Factory.newFeatureMap();
      features.put("string",tokAnn1.getFeatures().get("string").toString());
      try
      {
        outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);
      }
      catch(InvalidOffsetException e) {
        throw new JapeException(e);
      }
    }
  }
}
|
|
Ian Roberts
if(end(tokAnn1).longValue()==doc.getContent().size().longValue())

checks whether the {Token.kind == number} is at the end of the document, which it will never be, as it's always followed by the "x". For example, if the document ends in "1x", the number token ends one character before the end of the document, so the test is false.

The problem you're describing is actually very hard to achieve in a single JAPE rule, because JAPE doesn't have a way to specifically target the last annotation of a particular kind. Personally I'd probably use a different tool (probably the Groovy plugin), but if it must be JAPE then I'd do three phases:

Phase: AnnotateX
Input: Token
Options: control = all

Rule: FindX
({Token.string == "x"}):x
-->
:x.TempX = {}

Phase: LastX
Input: Token TempX
Options: control = appelt

Rule: DeleteNonFinalX
(
  ({TempX}):x
  {Token}
)
-->
:x {
  inputAS.removeAll(xAnnots);
}

Phase: LastQuantity
Input: Token TempX
Options: control = appelt

Rule: Quantity
(
  ({ Token.kind == number}):qty
  {TempX}
)
-->
:qty.LastQuantity = {string = @qty.Token.string}

The first phase annotates all "x" tokens with a temporary annotation. The second phase deletes all TempX annotations that are followed by another token (leaving a TempX only if the last token in the whole document is "x"). The third phase looks for the number followed by TempX, which thanks to the second phase can only ever be at the end.
Ian
On 05/05/2022 04:52, Dinesh wrote:
> After adding the required imports now it is throwing "InvalidOffsetException" for "outputAS.add(start(tokAnn1),end(tokAnn1),"LastQuantity",features);" I have caught that exception, Now the code is running without any error/exception but no output/annotation is resulted.
|
Dinesh
Thank you very much Ian!
That really works!!! I made a small adjustment to the last Phase, as follows, because it was raising exceptions for "@" [gate.creole.ResourceInstantiationException: gate.jape.parser.ParseException:]
------------------------------------------------------------
Phase: LastQuantity
Input: Token TempX LastX
Options: control = appelt

Rule: Quantity
(
  ({ Token.kind == number}):qty
  {TempX}
)
-->
:qty.Quantity = {Rule=Quantity}
------------------------------------------------------------
And now the last number token before "x" is correctly annotated as 'Quantity'. Thank you once again. If possible, can you please explain the logic for "LastX" for the GATE community?
|
|
Ian Roberts
Oh yeah, sorry, I was writing from memory and didn't test my own code - that should have been

:qty.LastQuantity = {string = :qty.Token.string}

(colon instead of @).
Ian
On 05/05/2022 12:02, Dinesh wrote:
> Thank you very much Ian!
|
Ian Roberts
On 05/05/2022 12:02, Dinesh wrote:
> If possible can you please explain the logic for "LastX" for the GATE community?

In my example LastX is the name of the phase, not a type of annotation. The LastX phase doesn't create any new annotations; it just deletes some of the ones created in phase 1. The Input line for the final phase should just be

Input: Token TempX

Ian
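
For readers arriving at the end of this thread, the three phases can be put together with both corrections applied (the colon instead of "@" in the last rule, and "Input: Token TempX" for the final phase). This is only a sketch assembled from the posts above and has not been run here. The comments are added for explanation, and it assumes the phases are either kept consecutively in one grammar file, as they were posted, or split into separate files listed in a main grammar.

```
// Phase 1: mark every "x" token with a temporary TempX annotation
Phase: AnnotateX
Input: Token
Options: control = all

Rule: FindX
({Token.string == "x"}):x
-->
:x.TempX = {}

// Phase 2: delete every TempX that is followed by another Token,
// so a TempX survives only if "x" is the last token in the document
Phase: LastX
Input: Token TempX
Options: control = appelt

Rule: DeleteNonFinalX
(
  ({TempX}):x
  {Token}
)
-->
:x {
  inputAS.removeAll(xAnnots);
}

// Phase 3: a number directly followed by the surviving TempX
// is annotated as LastQuantity, copying the number's string
Phase: LastQuantity
Input: Token TempX
Options: control = appelt

Rule: Quantity
(
  ({Token.kind == number}):qty
  {TempX}
)
-->
:qty.LastQuantity = {string = :qty.Token.string}
```

Note that a surviving TempX annotation is left in the annotation set after the last phase. Whether to clean it up in a further step is a design choice that the thread does not cover.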
|