Low-Level Text.i7x
Version 2 of Low-Level Text (for Glulx only) by Brady Garvin begins here.
[@]
"Mutable text for situations where Inform's indexed text isn't an option."
Include Runtime Checks by Brady Garvin.
Include Low-Level Operations by Brady Garvin.
Use authorial modesty.
Book "Copyright and License"
[Copyright 2013 Brady J. Garvin]
[This extension is released under the Creative Commons Attribution 3.0 Unported License (CC BY 3.0) so that it can qualify as a public Inform extension. See the LICENSE file included in the release for further details.]
Book "Extension Information"
[@]
[Inform already has mutable text in the form of indexed text. But for the Glulx Runtime Instrumentation Framework we need to worry about:
1. Reentrancy---if we inject a call into the middle of a block value management routine (which we will), the callee can't be sure that the block value record keeping is in a consistent state.
and
2. Speed---the interpreter-provided malloc is usually much faster than Inform's non-native, space-conscious equivalent.
So we use our own implementation. However, the adjective ``low-level'' is important: synthetic text is more awkward and less forgiving than the nicely encapsulated indexed text. It is suitable for use in the instrumentation framework and instrumentation extensions because that's what it was designed for. It might be appropriate in some other obscure situations. But it is not fit for general use in a story. (Besides, it doesn't support non-Latin-1 characters.)]
Book "Runtime Checks"
Chapter "Environment Checks"
An environment check rule (this is the check for dynamic memory allocation to support synthetic text rule):
	always check that memory allocation is supported or else say "[low-level runtime failure in]Low-Level Text[with explanation]This story uses low-level text, which in turn depends on dynamic memory allocation. But this interpreter doesn't allow dynamic memory allocation, meaning that the story cannot safely run.[terminating the story]".
Book "Synthetic Text"
Chapter "The Synthetic Text Structure" - unindexed
[@@]
[Layout:
4 bytes padding (at offset -8 bytes)
4 bytes for the length (at offset -4 bytes)
1 byte for Glulx's marker for uncompressed Latin-1 text
length bytes for the characters
1 byte for the null terminator]
Chapter "Construction and Destruction of Synthetic Text"
[@@]
To decide what text is a new uninitialized synthetic text with length (L - a number) character/characters (this is creating uninitialized synthetic text by length):
	let the result be a memory allocation of L plus ten bytes;
	increase the result by four;
	write the integer L to address result;
	increase the result by four;
	write the byte 224 to address result;
	write the byte zero to address result plus L plus one;
	decide on the result converted to some text.
[@@]
To decide what text is a new uninitialized permanent synthetic text with length (L - a number) character/characters (this is creating uninitialized permanent synthetic text by length):
	let the result be a permanent memory allocation of L plus ten bytes;
	increase the result by four;
	write the integer L to address result;
	increase the result by four;
	write the byte 224 to address result;
	write the byte zero to address result plus L plus one;
	decide on the result converted to some text.
[@@]
[We assume that T doesn't change length, for example because of substitutions' side-effects.]
To decide what text is a new synthetic text copied from (T - some text) (this is copying text to synthetic text):
	let the length be the length of T;
	let the metrics structure address be a memory allocation of length plus ten bytes;
	let the result be the metrics structure address plus eight;
	write the byte 224 to address result;
	print the text T to the Latin-1 array at address result plus one with length length and metrics structure at address metrics structure address;
	write the byte zero to address result plus the length plus one;
	decide on the result converted to some text.
[@@]
[We assume that T doesn't change length, for example because of substitutions' side-effects.]
To decide what text is a new permanent synthetic text copied from (T - some text) (this is copying text to permanent synthetic text):
	let the length be the length of T;
	let the metrics structure address be a permanent memory allocation of length plus ten bytes;
	let the result be the metrics structure address plus eight;
	write the byte 224 to address result;
	print the text T to the Latin-1 array at address result plus one with length length and metrics structure at address metrics structure address;
	write the byte zero to address result plus the length plus one;
	decide on the result converted to some text.
To decide what text is a new synthetic text extracted from the (N - a number) bytes at address (A - a number) (this is creating synthetic text from an array):
	let the result be a new uninitialized synthetic text with length N characters;
	copy N bytes from address A to address the character array address of the synthetic text result;
	decide on the result.
To decide what text is a new permanent synthetic text extracted from the (N - a number) bytes at address (A - a number) (this is creating permanent synthetic text from an array):
	let the result be a new uninitialized permanent synthetic text with length N characters;
	copy N bytes from address A to address the character array address of the synthetic text result;
	decide on the result.
[The result is taken to be the text between the first occurrence of P and the first subsequent occurrence of S in T (all of which are synthetic), returned as a new synthetic text. If there is no such text, this phrase returns the interned empty string.]
To decide what text is a new synthetic text extracted from the synthetic text (T - some text) between the synthetic prefix (P - some text) and the synthetic suffix (S - some text) or the interned empty string if there is no match (this is creating synthetic text by extraction):
	let the beginning index be the index of the synthetic text P in the synthetic text T;
	if beginning index is zero:
		decide on "";
	increase the beginning index by the length of the synthetic text P minus one;
	let the end index be the index of the synthetic text S in the synthetic text T starting after index beginning index;
	if the end index is zero:
		decide on "";
	let the length be the end index minus the beginning index;
	decrement the length;
	let the array address be the character array address of the synthetic text T plus the beginning index;
	decide on a new synthetic text extracted from the length bytes at address array address.
[The result is taken to be the text between the first occurrence of P and the first subsequent occurrence of S in T (all of which are synthetic), returned as a new permanent synthetic text. If there is no such text, this phrase returns the interned empty string.]
To decide what text is a new permanent synthetic text extracted from the synthetic text (T - some text) between the synthetic prefix (P - some text) and the synthetic suffix (S - some text) or the interned empty string if there is no match (this is creating permanent synthetic text by extraction):
	let the beginning index be the index of the synthetic text P in the synthetic text T;
	if beginning index is zero:
		decide on "";
	increase the beginning index by the length of the synthetic text P minus one;
	let the end index be the index of the synthetic text S in the synthetic text T starting after index beginning index;
	if the end index is zero:
		decide on "";
	let the length be the end index minus the beginning index;
	decrement the length;
	let the array address be the character array address of the synthetic text T plus the beginning index;
	decide on a new permanent synthetic text extracted from the length bytes at address array address.
[@@]
To delete the synthetic text (T - some text) (this is deleting synthetic text):
	let the text address be T converted to a number;
	free the memory allocation at address text address minus eight.
Chapter "Private Synthetic Text Accessors and Mutators" - unindexed
[@@]
To write the length (L - a number) to the synthetic text (T - some text): (- llo_setField({T}, -1, {L}); -).
[@@]
Chapter "Public Synthetic Text Accessors and Mutators"
To decide what number is the length of the synthetic text (T - some text): (- llo_getField({T}, -1) -).
To decide what number is the character array address of the synthetic text (T - some text): (- ({T} + 1) -).
To decide what Unicode character is the character at index (I - a number) of the synthetic text (T - some text): (- llo_getByte({T} + {I}) -). [I is one-based, hence the absence of a plus one.]
To decide what number is the character code at index (I - a number) of the synthetic text (T - some text): (- llo_getByte({T} + {I}) -). [I is one-based, hence the absence of a plus one.]
To write (C - a Unicode character) to index (I - a number) of the synthetic text (T - some text): (- llo_setByte({T} + {I}, {C}); -). [I is one-based, hence the absence of a plus one.]
To write the character code (C - a number) to index (I - a number) of the synthetic text (T - some text): (- llo_setByte({T} + {I}, {C}); -). [I is one-based, hence the absence of a plus one.]
Section "Capture to Synthetic Text"
Include (-
Global llt_oldStream;
Global llt_stream;
Global llt_oldLength;
Global llt_length;
-) after "Definitions.i6t".
[Using a pure stream implementation, CocoaGlk spends too much time flushing streams, and as a result some ordinarily split-second operations in Debug File Parsing take half a minute or so. If we detect CocoaGlk, we want the option of the filter I/O system available.]
Include (-
	[ llt_enter text;
		#ifndef COCOA_QUIET;
		llt_oldStream = glk_stream_get_current();
		#endif;
		! [@@]
		llt_oldLength = llo_getField(text, -1);
		if (llo_cocoaGlkDetected) {
			#ifndef COCOA_QUIET;
			glk_stream_set_current(0);
			#endif;
			! [@@]
			llo_cocoaTargetAddress = text + 1;
			llo_cocoaSpaceRemaining = llt_oldLength;
			rtrue;
		}
		! [@@]
		llt_stream = glk_stream_open_memory(text + 1, llt_oldLength, filemode_Write, 0);
		if (llt_stream) {
			glk_stream_set_current(llt_stream);
			rtrue;
		}
		llt_length = 0;
		rfalse;
	];
	[ llt_exit text;
		#ifndef COCOA_QUIET;
		glk_stream_set_current(llt_oldStream);
		#endif;
		if (llo_cocoaGlkDetected) {
			llt_length = llt_oldLength - llo_cocoaSpaceRemaining;
		} else {
			glk_stream_close(llt_stream, llo_streamToStringMetrics);
			llt_length = llo_getField(llo_streamToStringMetrics, 1);
		}
		if (llt_length < llt_oldLength) {
			! [@@]
			llo_setField(text, -1, llt_length);
			! [@@]
			llo_setByte(text + llt_length + 1, 0);
		} else {
			! [@@]
			llo_setByte(text + llt_oldLength + 1, 0);
		}
	];
-).
To overwrite the synthetic text (T - some text) with the text printed when we (P - a phrase): (-
	@push say__p;
	@push say__pc;
	@push say__n;
	@push debug_rules;
	debug_rules = 0;
	@push llt_oldStream;
	@push llt_stream;
	@push llt_oldLength;
	@push llt_length;
	! [@@]
	llt_oldLength = llo_getField({T}, -1);
	@getiosys sp sp;
	if (llo_cocoaGlkDetected) {
		@setiosys 1 llo_cocoaPrint;
		@push llo_cocoaTargetAddress;
		@push llo_cocoaSpaceRemaining;
	} else {
		@setiosys 2 0;
	}
	if (llt_enter({T})) {
		if(true) {P}
		llt_exit({T});
	}
	if (llo_cocoaGlkDetected) {
		@pull llo_cocoaSpaceRemaining;
		@pull llo_cocoaTargetAddress;
	}
	@stkswap;
	@setiosys sp sp;
	@pull llt_length;
	@pull llt_oldLength;
	@pull llt_stream;
	@pull llt_oldStream;
	@pull debug_rules;
	@pull say__n;
	@pull say__pc;
	@pull say__p;
-).
Section "Synthetic Text Case Changes"
To decide what number is the Latin-1 character code (C - a number) downcased: (- glk_char_to_lower({C}) -).
To decide what number is the Latin-1 character code (C - a number) upcased: (- glk_char_to_upper({C}) -).
To downcase the synthetic text (T - some text) (this is downcasing synthetic text):
	let the length be the length of the synthetic text T;
	let the character array address be the character array address of the synthetic text T;
	let the limit be the character array address plus the length;
	repeat with the character code address running from the character array address to the limit:
		let the downcased character code be the Latin-1 character code the byte at address character code address downcased;
		write the byte downcased character code to address character code address.
To upcase the synthetic text (T - some text) (this is upcasing synthetic text):
	let the length be the length of the synthetic text T;
	let the character array address be the character array address of the synthetic text T;
	let the limit be the character array address plus the length;
	repeat with the character code address running from the character array address to the limit:
		let the upcased character code be the Latin-1 character code the byte at address character code address upcased;
		write the byte upcased character code to address character code address.
Section "Synthetic Text Prefix and Suffix Removal"
To remove (N - a number) character/characters from the beginning of the synthetic text (T - some text) (this is removing characters from the beginning of synthetic text):
	let the destination address be the character array address of the synthetic text T;
	let the source address be the destination address plus N;
	let the new length be the length of the synthetic text T minus N;
	copy one plus the new length bytes from address source address to address destination address;
	write the length new length to the synthetic text T.
To remove (N - a number) character/characters from the end of the synthetic text (T - some text) (this is removing characters from the end of synthetic text):
	let the new length be the length of the synthetic text T minus N;
	write the length new length to the synthetic text T;
	let the terminator address be the new length plus the character array address of the synthetic text T;
	write the byte zero to address the terminator address.
To possibly remove the synthetic text prefix (P - some text) from the beginning of the synthetic text (T - some text) (this is possibly removing a prefix from synthetic text):
	if the synthetic text T begins with the synthetic text P:
		let the prefix length be the length of the synthetic text P;
		remove prefix length characters from the beginning of the synthetic text T.
To decide whether we successfully remove the synthetic text prefix (P - some text) from the beginning of the synthetic text (T - some text) (this is reporting success or failure after possibly removing a prefix from synthetic text):
	if the synthetic text T begins with the synthetic text P:
		let the prefix length be the length of the synthetic text P;
		remove prefix length characters from the beginning of the synthetic text T;
		decide yes;
	decide no.
To possibly remove the synthetic text suffix (S - some text) from the end of the synthetic text (T - some text) (this is possibly removing a suffix from synthetic text):
	if the synthetic text T ends with the synthetic text S:
		let the suffix length be the length of the synthetic text S;
		remove suffix length characters from the end of the synthetic text T.
To decide whether we successfully remove the synthetic text suffix (S - some text) from the end of the synthetic text (T - some text) (this is reporting success or failure after possibly removing a suffix from synthetic text):
	if the synthetic text T ends with the synthetic text S:
		let the suffix length be the length of the synthetic text S;
		remove suffix length characters from the end of the synthetic text T;
		decide yes;
	decide no.
Section "Iteration over Synthetic Text"
Include (-
Global llt_iterator;
-) after "Definitions.i6t".
To repeat with (I - a nonexisting number variable) running through the character codes in the synthetic text (T - some text) begin -- end: (-
	! [@@]
	llt_iterator = {T} + 1;
	jump LLO_LOOP_{-counter:LLO_LOOP}_ENTRY;
	for (::)
		if (llo_advance) {
			@pull llt_iterator;
			if (llo_broken) {
				break;
			}
			llt_iterator++;
		.LLO_LOOP_{-advance-counter:LLO_LOOP}_ENTRY;
			@aloadb llt_iterator 0 {I};
			if (~~{I}) {
				llo_broken = true;
				break;
			}
			@push llt_iterator;
			llo_advance = false;
		} else for(llo_oneTime = true, llo_broken = true, llo_advance = true: llo_oneTime && ((llo_oneTime = false), true) || (llo_broken = false):)
-).
[@]
Chapter "Synthetic Text Equality Tests"
To decide whether the synthetic text (S - some text) is identical to the synthetic text (T - some text) (this is testing equality between synthetic text and synthetic text):
	decide on whether or not the length of the synthetic text S is the length of the synthetic text T and the synthetic text S begins with the synthetic text T.
To decide whether the synthetic text (S - some text) is identical to (T - some text) (this is testing equality between synthetic text and text):
	let the conversion be a new synthetic text copied from T;
	let the result be whether or not the synthetic text S is identical to the synthetic text the conversion;
	delete the synthetic text conversion;
	decide on the result.
To decide whether (S - some text) is identical to the synthetic text (T - some text) (this is testing equality between text and synthetic text):
	let the conversion be a new synthetic text copied from S;
	let the result be whether or not the synthetic text the conversion is identical to the synthetic text T;
	delete the synthetic text conversion;
	decide on the result.
To decide whether (S - some text) is identical to (T - some text) (this is testing equality between text and text):
	let the conversion of S be a new synthetic text copied from S;
	let the conversion of T be a new synthetic text copied from T;
	let the result be whether or not the synthetic text the conversion of S is identical to the synthetic text the conversion of T;
	delete the synthetic text conversion of S;
	delete the synthetic text conversion of T;
	decide on the result.
Chapter "Synthetic Text Substring Tests"
To decide what number is the index of (C - a Unicode character) in the synthetic text (T - some text) (this is finding a character in synthetic text):
	let the length be the length of the synthetic text T;
	let the array address be the character array address of the synthetic text T;
	decide on one plus the index of the byte C converted to a number in the length bytes at address array address.
To decide what number is the index of the character code (C - a number) in the synthetic text (T - some text) (this is finding a character code in synthetic text):
	let the length be the length of the synthetic text T;
	let the array address be the character array address of the synthetic text T;
	decide on one plus the index of the byte C in the length bytes at address array address.
To decide what number is the index of the synthetic text (S - some text) in the synthetic text (T - some text) (this is finding synthetic text in synthetic text):
	let the first length be the length of the synthetic text S;
	let the second length be the length of the synthetic text T;
	let the first array address be the character array address of the synthetic text S;
	let the second array address be the character array address of the synthetic text T;
	decide on one plus the index of the first length bytes at address first array address in the second length bytes at address second array address.
To decide what number is the index of (C - a Unicode character) in the synthetic text (T - some text) starting after index (I - a number) (this is finding a subsequent character in synthetic text):
	let the length be the length of the synthetic text T minus I;
	if the length is at most zero:
		decide on zero;
	let the array address be I plus the character array address of the synthetic text T;
	let the result be the index of the byte C converted to a number in the length bytes at address array address;
	if the result is less than zero:
		decide on zero;
	decide on one plus I plus the result.
To decide what number is the index of the character code (C - a number) in the synthetic text (T - some text) starting after index (I - a number) (this is finding a subsequent character code in synthetic text):
	let the length be the length of the synthetic text T minus I;
	if the length is at most zero:
		decide on zero;
	let the array address be I plus the character array address of the synthetic text T;
	let the result be the index of the byte C in the length bytes at address array address;
	if the result is less than zero:
		decide on zero;
	decide on one plus I plus the result.
To decide what number is the index of the synthetic text (S - some text) in the synthetic text (T - some text) starting after index (I - a number) (this is finding subsequent synthetic text in synthetic text):
	let the second length be the length of the synthetic text T minus I;
	if the second length is at most zero:
		decide on zero;
	let the first length be the length of the synthetic text S;
	let the first array address be the character array address of the synthetic text S;
	let the second array address be I plus the character array address of the synthetic text T;
	let the result be the index of the first length bytes at address first array address in the second length bytes at address second array address;
	if the result is less than zero:
		decide on zero;
	decide on one plus I plus the result.
Chapter "Synthetic Text Prefix and Suffix Tests"
To decide whether the synthetic text (T - some text) begins with the synthetic text (S - some text) (this is testing for a synthetic text prefix):
	decide on whether or not the index of the synthetic text S in the synthetic text T is one.
To decide whether the synthetic text (T - some text) ends with the synthetic text (S - some text) (this is testing for a synthetic text suffix):
	let the excess be the length of the synthetic text T minus the length of the synthetic text S;
	decide on whether or not the excess is at least zero and the index of the synthetic text S in the synthetic text T starting after index excess is one plus the excess.
Low-Level Text ends here.
---- DOCUMENTATION ----
Chapter: Synopsis
[@]
When we build on the extension Glulx Runtime Instrumentation Framework to create
Inform debugging tools, we often encounter situations where Inform's block value
management system needs to remain untouched, and we are unable to use Inform's
indexed text. Low-Level Text contains a replacement text implementation that,
while less cleanly encapsulated, is a safe alternative even in these scenarios.
Details are in the following chapters.
Chapter: Usage
[@]
As story authors we have Inform's indexed text for text that we generate on the
fly. But when we write instrumentation with the Glulx Runtime Instrumentation
Framework, any phrases involving Inform's block value management system are
unsafe, so indexed text is not an option. Low-Level Text offers a modest
replacement.
As with all of these replacement extensions, the adjectives "low-level" and
"modest" are important: the phrases in Low-Level Text give us more chances to
commit subtle and hard-to-debug errors. Unless we have a compelling reason,
like the fact that we're writing instrumentation, we will be better off with the
safer implementation provided by the Inform language.
Section: Overview
Low-Level Text introduces a new kind of text, synthetic text. It behaves very
much like indexed text except that we cannot grow it beyond the length it was
created with, we cannot use non-Latin-1 characters, at least in the current
version (see http://en.wikipedia.org/wiki/ISO/IEC_8859-1 for a description of
the Latin-1 encoding and its list of characters), and we must watch out for two
potential pitfalls:
First, Inform understandably does not allow declarations like
	Synthetic text is a kind of text.
unless we use some awkward per-project template hacking. So instead, it is up
to us to keep track of which texts are synthetic and which are not; the story
will likely crash if we apply synthetic text phrases to ordinary text.
[@]
Second, we must be very careful with the word "is". Suppose that we have
synthetic text called X with the contents "plugh". The line
	showme whether or not X is "plugh";
will, perhaps surprisingly, say false. "Is" compares the identity of texts, and
X is separate from Inform's copy of the word, even if it happens to have the
same string of characters. Similarly, if Y is a distinct synthetic text also
containing "plugh",
	showme whether or not X is Y;
will print false. See the section titled "Comparing" for phrases that compare
the contents of synthetic text.
Section: Creating and destroying synthetic text
We can obtain a gibberish text by asking for
	a new uninitialized synthetic text with length (L - a number) character/characters
and then correct its contents later. We can copy characters from another text
(ordinary or synthetic, so long as it doesn't contain substitutions that will
change length after being said), as in
	a new synthetic text copied from (T - some text)
Or we can build text from a byte array of character codes:
	a new synthetic text extracted from the (N - a number) bytes at address (A - a number)
If we add the word "permanent" before the words "synthetic text" in these
phrases, we will get back a synthetic text that cannot be subsequently
destroyed, but will affect story performance less, at least under some
interpreters.
When we're done with synthetic text that isn't permanent, we should ask the VM
to reclaim its memory:
	delete the synthetic text (T - some text)
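As a minimal sketch of the whole lifecycle, the phrase below copies a literal,
reports on the copy, and then releases it. The phrase name and the variable
name are hypothetical, chosen only for illustration, and saying the copy
directly assumes that the interpreter prints it like any other Latin-1 string.

```inform7
To demonstrate the synthetic text lifecycle:
	[Allocate a fresh, deletable copy of the literal.]
	let the scratch copy be a new synthetic text copied from "plugh";
	say "Copied [the length of the synthetic text scratch copy] characters: [scratch copy].";
	[Hand the memory back; the scratch copy must not be used after this line.]
	delete the synthetic text scratch copy.
```

Had we asked for a permanent synthetic text instead, the final line would have
to be dropped, since permanent text cannot be destroyed.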
Section: Measuring and indexing
In Low-Level Operations, text is measured with the phrase
	the length of the text (T - some text)
If we know T to be synthetic text, we can substitute the slightly faster phrase
	the length of the synthetic text (T - some text)
The contents of synthetic text are available as the address of a byte array of
character codes:
	the character array address of the synthetic text (T - some text)
as single characters:
	the character at index (I - a number) of the synthetic text (T - some text)
as single character codes:
	the character code at index (I - a number) of the synthetic text (T - some text)
or as character codes traversed by a loop:
	repeat with (I - a nonexisting number variable) running through the character codes in the synthetic text (T - some text):
		....
Note that, as with indexed text, character indices are numbered from one.
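For instance, the loop form might be used to count spaces, as in the sketch
below. The phrase name "the space count of" and the variable names are
hypothetical; only the repeat phrase itself comes from this extension.

```inform7
To decide what number is the space count of the synthetic text (T - some text):
	let the total be zero;
	repeat with the code running through the character codes in the synthetic text T:
		[32 is the Latin-1 character code for a space.]
		if the code is 32:
			increase the total by one;
	decide on the total.
```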
To change the contents, we have the phrase
	write (C - a Unicode character) to index (I - a number) of the synthetic text (T - some text)
and its relative
	write the character code (C - a number) to index (I - a number) of the synthetic text (T - some text)
Another option is the phrase
	overwrite the synthetic text (T - some text) with the text printed when we ...
where the ellipsis can be any one line of Inform code, just as with an "if"
followed by a comma. For instance,
	overwrite the synthetic text T with the text printed when we say "Hi!";
is legal, but not
	overwrite the synthetic text T with the text printed when we repeat with foo running from one to two:
		say "[foo]";
T will be replaced with whatever is said by that line, except that the output
may be cut off if it is longer than T has room to store.
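A typical pattern, sketched here under the assumption that sixteen characters
is enough room, is to allocate an uninitialized buffer and then capture output
into it. The phrase name and the variable name are hypothetical. Judging from
the extension's source, when less is printed than the buffer can hold, the
recorded length shrinks to match what was captured.

```inform7
To demonstrate overwriting synthetic text:
	[Sixteen characters is an arbitrary capacity; longer output may be cut off.]
	let the buffer be a new uninitialized synthetic text with length 16 characters;
	overwrite the synthetic text buffer with the text printed when we say "Hi!";
	say "The buffer now holds [the length of the synthetic text buffer] characters.";
	delete the synthetic text buffer.
```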
[@]
Section: Comparing
We can compare the contents---rather than the identities---of two texts by
asking
	if (S - some text) is identical to (T - some text):
		....
Faster-running variations on this pattern are available if we know that S or T
is synthetic:
	if (S - some text) is identical to the synthetic text (T - some text):
		....
	if the synthetic text (S - some text) is identical to (T - some text):
		....
and
	if the synthetic text (S - some text) is identical to the synthetic text (T - some text):
		....
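The difference between identity and contents can be seen side by side in a
short sketch like the following, where the phrase name is hypothetical and X
plays the same role as in the overview.

```inform7
To demonstrate comparing synthetic text:
	let X be a new synthetic text copied from "plugh";
	[Identity comparison: X is a separate copy, so this test fails.]
	if X is "plugh":
		say "Same identity.";
	[Content comparison: only X is known to be synthetic, so we use the mixed form.]
	if the synthetic text X is identical to "plugh":
		say "Same contents.";
	delete the synthetic text X.
```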
Section: Searching
We can request the location of one synthetic text in another:
	the index of the synthetic text (S - some text) in the synthetic text (T - some text)
The result here is the (one-based) index of the first character of the first
exact copy of S in T, or zero if no exact copy exists. To find later matches we
can ask for
	the index of the synthetic text (S - some text) in the synthetic text (T - some text) starting after index (I - a number)
which behaves identically except that it only considers occurrences of S that
begin after index I (an occurrence at index I is not considered).
If we are searching for a single character, the phrases
	the index of (C - a Unicode character) in the synthetic text (T - some text)
and
	the index of (C - a Unicode character) in the synthetic text (T - some text) starting after index (I - a number)
behave analogously, while the phrases
	the index of the character code (C - a number) in the synthetic text (T - some text)
and
	the index of the character code (C - a number) in the synthetic text (T - some text) starting after index (I - a number)
take character codes in place of characters.
In a similar vein we can check text for a prefix,
	if the synthetic text (T - some text) begins with the synthetic text (S - some text):
		....
or a suffix:
	if the synthetic text (T - some text) ends with the synthetic text (S - some text):
		....
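Because a failed search yields zero, the two search phrases chain naturally
into a loop over every match. The sketch below counts occurrences, including
overlapping ones, and assumes that S is not empty; the phrase name "the
occurrence count of" and the variable names are hypothetical.

```inform7
To decide what number is the occurrence count of the synthetic text (S - some text) in the synthetic text (T - some text):
	let the tally be zero;
	let the position be the index of the synthetic text S in the synthetic text T;
	[A position of zero means that no further match exists.]
	while the position is greater than zero:
		increase the tally by one;
		now the position is the index of the synthetic text S in the synthetic text T starting after index position;
	decide on the tally.
```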
Section: Recasing
We can overwrite synthetic text with its lowercase version via the phrase
downcase the synthetic text (T - some text)
In the same way, to make T uppercase, we use the phrase
upcase the synthetic text (T - some text)
The same operations on Latin-1 character codes are accomplished by requesting
the Latin-1 character code (C - a number) downcased
or
the Latin-1 character code (C - a number) upcased
Section: Trimming
To remove characters from the beginning or end of synthetic text, we have the
two phrases
remove (N - a number) character/characters from the beginning of the synthetic text (T - some text)
and
remove (N - a number) character/characters from the end of the synthetic text (T - some text)
An often more convenient pair of phrases,
possibly remove the synthetic text prefix (P - some text) from the beginning of the synthetic text (T - some text)
and
possibly remove the synthetic text suffix (S - some text) from the end of the synthetic text (T - some text)
do the same thing, where N is given by the length of P or S, except that they
first check that the characters to be removed are exactly those in P or S, and
do nothing if they are not. Yet another variation has the two phrases
if we successfully remove the synthetic text prefix (P - some text) from the beginning of the synthetic text (T - some text):
....
and
if we successfully remove the synthetic text suffix (S - some text) from the end of the synthetic text (T - some text):
....
which do the same, but this time decide yes if and only if the removal actually
took place.
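For instance, the conditional and unconditional forms can be combined to strip a delimiter from both ends of a text, touching the end only if the beginning matched. This is only a sketch: the phrase name "unwrap ... from the synthetic delimiter ..." is hypothetical, and both T and D are assumed to be synthetic texts.

	To unwrap the synthetic text (T - some text) from the synthetic delimiter (D - some text):
		if we successfully remove the synthetic text prefix D from the beginning of the synthetic text T:
			possibly remove the synthetic text suffix D from the end of the synthetic text T.

If T does not begin with D, nothing is changed; if it begins with D but does not end with it, only the prefix is removed.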
Section: Extracting
The last possibility is a tad more eccentric than the others, and mainly
provided to support other extensions in the Glulx Runtime Instrumentation
Project. Still, here it is:
If we ask for
a new synthetic text extracted from the synthetic text (T - some text) between the synthetic prefix (P - some text) and the synthetic suffix (S - some text) or the interned empty string if there is no match
we get back, as a synthetic text, all of the text between the first occurrence
of P and the first subsequent occurrence of S, both in T. But if either P or S
cannot be found, we don't get a synthetic text at all---we get Inform's copy of
the empty string, "". The qualifier "permanent" can be used here, just as with
the other synthetic text constructors.
Chapter: Requirements, Limitations, and Bugs
This version was tested with Inform 6G60. It may not function under other
versions.
Section: Obscure limitations, which should affect almost nobody
A few actions, like throwing an exception or calling glk_set_stream, ought not
to take place inside text routines used with these phrases, for reasons that are
hopefully obvious. Inform will never pull such shenanigans on its own.
Section: Regarding bugs
If you encounter a bug, check first on the project website
(https://github.com/i7/i7grip) to see whether a newer version of this extension
is available. If, even using the latest version, the fault remains, please file
a bug report: On the website, choose "Issues" from the toolbar and click on "New
Issue".
I will try to respond quickly, at least with an estimate of when the bug might
be fixed, though sometimes I am away from the internet for a week or two at a
time.
Chapter: Acknowledgements
Low-Level Text was prepared as part of the Glulx Runtime Instrumentation Project
(https://github.com/i7/i7grip).
GRIP owes a great deal to everyone who made Inform possible and everyone who
continues to contribute. I'd like to give especial thanks to Graham Nelson and
Emily Short, not only for their design and coding work, but also for all of the
documentation, both of the language and its internals---it proved indispensable.
I am likewise indebted to everybody who worked to make Glulx and Glk a reality.
Without them, there simply wouldn't have been any hope for this kind of project.
My special thanks to Andrew Plotkin, with further kudos for his work maintaining
the specifications. They proved as essential as Inform's documentation.
The project itself was inspired by suggestions from Ron Newcomb and Esteban
Montecristo on Inform's feature request page. It's only because of their posts
that I ever started. (And here's hoping that late is better than never.)
Esteban Montecristo also made invaluable contributions as an alpha tester. I
cannot thank him enough: he signed on as a beta tester but then quickly
uncovered a slew of problems that forced me to reconsider both the term "beta"
and my timeline. The impetus for the new, cleaner design and several clues that
led to huge performance improvements are all due to him. Moreover, he
contributed code, since modified to fit the revised framework, for the extension
Verbose Diagnostics.
As for Ron Newcomb, I can credit him for nearly half of the bugs unearthed in
the beta proper, not to mention sound advice on the organization of the
documentation and the extensions. GRIP is much sturdier as a result.
Roger Carbol, Jesse McGrew, Michael Martin, Dan Shiovitz, Johnny Rivera, and
probably several others deserve similar thanks for answering questions on
ifMUD's I6 and I7 channels. I am grateful to Andrew Plotkin, David Kinder, and
others for the same sort of help on intfiction.org.
On top of that, David Kinder was kind enough to accommodate Debug File Parsing
in the Windows IDE; consequently, authors who have a sufficiently recent version
of Windows no longer need to write batch scripts. His help is much appreciated,
particularly because the majority of downloaders are running Windows.
Even with the IDEs creating debug files, setting up symbolic links to those
files can be a chore. Jim Aiken suggested an automated solution, which now
ships with the project.
And preliminary support for authors who want to debug inside a browser stems
from discussion with Erik Temple and Andrew Plotkin; my thanks for their ideas.
Finally, I should take this opportunity to express my gratitude to everyone who
helped me get involved in the IF community. Notable among these people are
Jesse McGrew and Emily Short, not to mention Jacqueline Lott, David Welbourn,
and all of the other Club Floyd attendees.