omit optional args; change csv newline handling
1. Allow JavaScript-style omission of optional args
   by simply omitting a token. See end of RemesPath.md.
2. Add header_handling optional 6th arg for s_csv
3. Make it so newlines in strings are no longer escaped
    when outputting CSV files. Instead newline-containing strings
    will be wrapped in quotes.
molsonkiko committed Nov 28, 2023
1 parent b53c7c8 commit c8e44c8
Showing 13 changed files with 369 additions and 200 deletions.
10 changes: 8 additions & 2 deletions CHANGELOG.md
@@ -20,7 +20,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
]
}
```
3. Add option for users to choose newline in JSON Lines and JSON-to-CSV.
3. Add option for users to choose newline in JSON Lines.
4. Option for users to choose whether to escape newlines in CSV files.

### To Be Changed

@@ -47,15 +48,20 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
1. Option to customize which [toolbar icons](/docs/README.md#toolbar-icons) are displayed, and their order.
2. [For loops in RemesPath](/docs/RemesPath.md#for-loopsloop-variables-added-in-v60)
3. [`bool, s_csv` and `s_fa` vectorized arg functions](/docs/RemesPath.md#vectorized-functions) and [`randint` non-vectorized arg function](/docs/RemesPath.md#non-vectorized-functions) to RemesPath.
*
4. Make second argument of [`s_split` RemesPath function](/docs/RemesPath.md#vectorized-functions) optional; 1-argument variant splits on whitespace.

### Changed

1. The way object keys are represented internally has been changed (*this has no effect on the GUI-based API, but only for developers*). Previously, when pretty-printing and compressing JSON, object keys would be output as is (without escaping special characters), meaning that *prior to v6.0, some strings were not valid object keys (again, this did not affect parsing of JSON, but only some programmatic applications that constructed JObjects directly without parsing).* Now all strings are valid object keys.
2. When using the JSON-to-CSV form to create CSV files, newline characters will no longer be escaped in strings. Instead, strings containing newlines will be wrapped in quotes, which should be sufficient to allow most CSV parsers to handle them correctly.

### Fixed

1. Fixed plugin crash when attempting to parse too-large hex numbers like `0x100000000000000000000`. Now the parser will fatally fail and add a lint indicating the issue, but the plugin will not actually crash.
2. Fixed some weird issues where mutating a variable in RemesPath could cause re-executing a query on the same input to return a different value. A minimal example: `var x = 1; x = @ + 1; x` would return 1 + (the number of times the query was executed) prior to this fix, but now it will always return `2` as expected. This was also true of a bunch of other things in RemesPath, including [projections and the map operator](/docs/RemesPath.md#projections).
3. Fix issues where running a RemesPath query with a projection that referenced a variable indexing on a compile-time constant would cause an error. For example, `var x = @; 1->x` should return `@` (the input to the query), but prior to this fix, it would instead cause an error.
4. Running tests would previously cause clipboard data to be lost irreversibly. Now, if the user's clipboard contained text before running tests, the contents of the clipboard are restored to their pre-test values rather than being hijacked. __Non-text data that was copied to the clipboard is still lost when running tests, and I may try to fix that in the future.__
5. `dict` function in RemesPath previously had a bug that could create invalid JSON if the strings to be turned into keys contained special characters (e.g., literal quote chars, `\r`, `\n`).

## [5.8.0] - 2023-10-09

80 changes: 40 additions & 40 deletions JsonToolsNppPlugin/JSONTools/JNode.cs
@@ -286,48 +286,48 @@ public JNode(bool value, int position = 0)
};
#region TOSTRING_FUNCS
/// <summary>
/// string representation of any characters in JSON
/// appends the JSON representation of char c to a StringBuilder.<br></br>
/// for most characters, this just means appending the character itself, but for example '\n' would become "\\n", '\t' would become "\\t",<br></br>
/// and most other chars less than 32 would be appended as "\\u00{char value in hex}" (e.g., '\x14' becomes "\\u0014")
/// </summary>
/// <param name="c"></param>
/// <returns></returns>
public static string CharToString(char c)
public static void CharToSb(StringBuilder sb, char c)
{
switch (c)
{
case '\\': return "\\\\";
case '"': return "\\\"";
case '\x01': return "\\u0001";
case '\x02': return "\\u0002";
case '\x03': return "\\u0003";
case '\x04': return "\\u0004";
case '\x05': return "\\u0005";
case '\x06': return "\\u0006";
case '\x07': return "\\u0007";
case '\x08': return "\\b";
case '\x09': return "\\t";
case '\x0A': return "\\n";
case '\x0B': return "\\v";
case '\x0C': return "\\f";
case '\x0D': return "\\r";
case '\x0E': return "\\u000E";
case '\x0F': return "\\u000F";
case '\x10': return "\\u0010";
case '\x11': return "\\u0011";
case '\x12': return "\\u0012";
case '\x13': return "\\u0013";
case '\x14': return "\\u0014";
case '\x15': return "\\u0015";
case '\x16': return "\\u0016";
case '\x17': return "\\u0017";
case '\x18': return "\\u0018";
case '\x19': return "\\u0019";
case '\x1A': return "\\u001A";
case '\x1B': return "\\u001B";
case '\x1C': return "\\u001C";
case '\x1D': return "\\u001D";
case '\x1E': return "\\u001E";
case '\x1F': return "\\u001F";
default: return new string(c, 1);
case '\\': sb.Append("\\\\" ); break;
case '"': sb.Append("\\\"" ); break;
case '\x01': sb.Append("\\u0001"); break;
case '\x02': sb.Append("\\u0002"); break;
case '\x03': sb.Append("\\u0003"); break;
case '\x04': sb.Append("\\u0004"); break;
case '\x05': sb.Append("\\u0005"); break;
case '\x06': sb.Append("\\u0006"); break;
case '\x07': sb.Append("\\u0007"); break;
case '\x08': sb.Append("\\b" ); break;
case '\x09': sb.Append("\\t" ); break;
case '\x0A': sb.Append("\\n" ); break;
case '\x0B': sb.Append("\\v" ); break;
case '\x0C': sb.Append("\\f" ); break;
case '\x0D': sb.Append("\\r" ); break;
case '\x0E': sb.Append("\\u000E"); break;
case '\x0F': sb.Append("\\u000F"); break;
case '\x10': sb.Append("\\u0010"); break;
case '\x11': sb.Append("\\u0011"); break;
case '\x12': sb.Append("\\u0012"); break;
case '\x13': sb.Append("\\u0013"); break;
case '\x14': sb.Append("\\u0014"); break;
case '\x15': sb.Append("\\u0015"); break;
case '\x16': sb.Append("\\u0016"); break;
case '\x17': sb.Append("\\u0017"); break;
case '\x18': sb.Append("\\u0018"); break;
case '\x19': sb.Append("\\u0019"); break;
case '\x1A': sb.Append("\\u001A"); break;
case '\x1B': sb.Append("\\u001B"); break;
case '\x1C': sb.Append("\\u001C"); break;
case '\x1D': sb.Append("\\u001D"); break;
case '\x1E': sb.Append("\\u001E"); break;
case '\x1F': sb.Append("\\u001F"); break;
default: sb.Append(c); break;
}
}

@@ -343,7 +343,7 @@ public static string StrToString(string s, bool quoted)
if (quoted)
sb.Append('"');
foreach (char c in s)
sb.Append(CharToString(c));
CharToSb(sb, c);
if (quoted)
sb.Append('"');
return sb.ToString();
@@ -374,7 +374,7 @@ public virtual string ToString(bool sort_keys = true, string key_value_sep = ":
var sb = new StringBuilder();
sb.Append('"');
foreach (char c in (string)value)
sb.Append(CharToString(c));
CharToSb(sb, c);
sb.Append('"');
return sb.ToString();
}
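
The change from `CharToString` to `CharToSb` in this file can be illustrated with a compact sketch. This is not the plugin's code: the class and method names (`JsonEscapeSketch`, `AppendEscaped`, `Quote`) are hypothetical, and the sketch collapses the long `switch` into a range check. It also differs from the diff in two details: it writes `'\x0B'` as `\u000B` rather than `\v`, and it escapes `'\x00'`, which the `switch` in the diff leaves to the default branch. The point is only that appending to a caller-supplied `StringBuilder` avoids building a temporary string for every character.

```csharp
using System;
using System.Text;

// Hypothetical sketch; names are illustrative and not part of JsonTools.
public static class JsonEscapeSketch
{
    // Appends the JSON-escaped form of c directly to sb,
    // so no intermediate string is allocated per character.
    public static void AppendEscaped(StringBuilder sb, char c)
    {
        switch (c)
        {
            case '\\': sb.Append("\\\\"); break;
            case '"':  sb.Append("\\\""); break;
            case '\b': sb.Append("\\b"); break;
            case '\t': sb.Append("\\t"); break;
            case '\n': sb.Append("\\n"); break;
            case '\f': sb.Append("\\f"); break;
            case '\r': sb.Append("\\r"); break;
            default:
                if (c < ' ')
                    // any other control character becomes \u00XX (uppercase hex)
                    sb.Append("\\u").Append(((int)c).ToString("X4"));
                else
                    sb.Append(c);
                break;
        }
    }

    // Wraps s in double quotes, escaping each character on the way.
    public static string Quote(string s)
    {
        var sb = new StringBuilder(s.Length + 2);
        sb.Append('"');
        foreach (char c in s)
            AppendEscaped(sb, c);
        sb.Append('"');
        return sb.ToString();
    }
}
```

Under these assumptions, `JsonEscapeSketch.Quote("a\"b\n")` yields the eight characters `"a\"b\n"`, with the newline written as a backslash followed by `n`.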
10 changes: 5 additions & 5 deletions JsonToolsNppPlugin/JSONTools/JsonParser.cs
@@ -696,7 +696,7 @@ public string ParseKey(string inp)
char nextChar = inp[ii + 1];
if (nextChar == quoteChar)
{
sb.Append(JNode.CharToString(quoteChar));
JNode.CharToSb(sb, quoteChar);
ii++;
}
else if (ESCAPE_MAP.TryGetValue(nextChar, out _))
@@ -713,7 +713,7 @@ public string ParseKey(string inp)
int nextHex = ParseHexChar(inp, 4);
if (HandleCharErrors(nextHex, inp, ii))
break;
sb.Append(JNode.CharToString((char)nextHex));
JNode.CharToSb(sb, (char)nextHex);
}
else if (nextChar == '\n' || nextChar == '\r')
{
@@ -730,7 +730,7 @@ public string ParseKey(string inp)
if (HandleCharErrors(nextHex, inp, ii))
break;
HandleError("\\x escapes are only allowed in JSON5", inp, ii, ParserState.JSON5);
sb.Append(JNode.CharToString((char)nextHex));
JNode.CharToSb(sb, (char)nextHex);
}
else HandleError($"Escaped char '{nextChar}' is only valid in JSON5", inp, ii + 1, ParserState.JSON5);
}
@@ -740,7 +740,7 @@ public string ParseKey(string inp)
HandleError($"Object key contains newline", inp, ii, ParserState.BAD);
else
HandleError("Control characters (ASCII code less than 0x20) are disallowed inside strings under the strict JSON specification", inp, ii, ParserState.OK);
sb.Append(JNode.CharToString(c));
JNode.CharToSb(sb, c);
}
else
{
@@ -790,7 +790,7 @@ public string ParseUnquotedKeyHelper(string inp, string result)
char hexval = (char)int.Parse(m.Value, NumberStyles.HexNumber);
if (HandleCharErrors(hexval, inp, ii))
return null;
sb.Append(JNode.CharToString(hexval));
JNode.CharToSb(sb, hexval);
start = m.Index + 4;
m = m.NextMatch();
}
41 changes: 10 additions & 31 deletions JsonToolsNppPlugin/JSONTools/JsonTabularize.cs
@@ -6,9 +6,7 @@ Uses an algorithm to flatten nested JSON.
*/
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using JSON_Tools.Utils;

namespace JSON_Tools.JSON_Tools
{
@@ -798,52 +796,33 @@ public JArray BuildTable(JNode obj, Dictionary<string, object> schema, string ke
}

/// <summary>
/// If a string contains the delimiter, wrap that string in quotes.<br></br>
/// Also replace any newlines with "\\n".<br></br>
/// Otherwise, leave it alone.
/// If a string contains the delimiter or a newline, append (the string wrapped in quotes) to sb<br></br>
/// Otherwise, append s to sb.
/// </summary>
/// <param name="s"></param>
/// <param name="delim"></param>
/// <param name="quote_char"></param>
/// <returns></returns>
private string ApplyQuotesIfNeeded(string s, char delim, char quote_char)
private void ApplyQuotesIfNeeded(StringBuilder sb, string s, char delim, char quote_char, string newline)
{
StringBuilder sb = new StringBuilder();
if (s.Contains(delim))
if (s.IndexOf(delim) >= 0 || s.IndexOf(newline) >= 0)
{
// if the string contains the delimiter, we need to wrap it in quotes
// if the string contains the delimiter or a newline, we need to wrap it in quotes
// we also need to escape all literal quote characters in the string
// regardless of what happens, we need to replace internal newlines with "\\n"
// so we don't get fake rows
sb.Append(quote_char);
foreach (char c in s)
{
{
if (c == quote_char)
{
sb.Append('\\');
sb.Append(quote_char);
}
else if (c == '\n')
sb.Append("\\n");
else if (c == '\r')
sb.Append("\\r");
else
sb.Append(c);
}
}
sb.Append(quote_char);
return sb.ToString();
}
// just replace newlines
foreach (char c in s)
{
if (c == '\n')
sb.Append("\\n");
else if (c == '\r')
sb.Append("\\r");
else
sb.Append(c);
}
return sb.ToString();
else sb.Append(s);
}

public string TableToCsv(JArray table, char delim = ',', char quote_char = '"', string[] header = null, bool bools_as_ints = false, string newline = "\n")
@@ -871,7 +850,7 @@ public string TableToCsv(JArray table, char delim = ',', char quote_char = '"',
for (int ii = 0; ii < header.Length; ii++)
{
string col = header[ii];
sb.Append(ApplyQuotesIfNeeded(col, delim, quote_char));
ApplyQuotesIfNeeded(sb, col, delim, quote_char, newline);
if (ii < header.Length - 1) sb.Append(delim);
}
sb.Append(newline);
@@ -891,7 +870,7 @@ public string TableToCsv(JArray table, char delim = ',', char quote_char = '"',
switch (val.type)
{
case Dtype.STR:
sb.Append(ApplyQuotesIfNeeded((string)val.value, delim, quote_char));
ApplyQuotesIfNeeded(sb, (string)val.value, delim, quote_char, newline);
break; // only apply quotes if internal delim
case Dtype.DATE:
sb.Append(((DateTime)val.value).ToString("yyyy-MM-dd"));
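
Because the added and removed lines are interleaved in this view, the new behavior of `ApplyQuotesIfNeeded` is easier to follow as a standalone sketch. The version below is one reading of the hunks above, not the committed code: the class and method names (`CsvQuoteSketch`, `AppendField`) are hypothetical, and the ordinal comparison for the newline search is an assumption of this sketch. A field is wrapped in quotes only when it contains the delimiter or the newline string, literal quote characters inside a quoted field are preceded by a backslash, and newlines are no longer rewritten as `\n`.

```csharp
using System;
using System.Text;

// Hypothetical sketch; names are illustrative and not part of JsonTools.
public static class CsvQuoteSketch
{
    // Appends one CSV field to sb, quoting it only when necessary.
    public static void AppendField(StringBuilder sb, string s, char delim, char quoteChar, string newline)
    {
        bool needsQuotes = s.IndexOf(delim) >= 0
            || s.IndexOf(newline, StringComparison.Ordinal) >= 0; // ordinal search: an assumption of this sketch
        if (!needsQuotes)
        {
            sb.Append(s); // no delimiter or newline: append unchanged
            return;
        }
        sb.Append(quoteChar);
        foreach (char c in s)
        {
            if (c == quoteChar)
                sb.Append('\\'); // backslash before literal quote chars, as the diff appears to do
            sb.Append(c);        // newlines are kept verbatim inside the quotes
        }
        sb.Append(quoteChar);
    }
}
```

With `delim = ','`, `quoteChar = '"'`, and `newline = "\n"`, the field `a`, newline, `b` comes out wrapped in double quotes with the newline intact, while `plain` is appended as is.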
18 changes: 18 additions & 0 deletions JsonToolsNppPlugin/JSONTools/RemesPath.cs
@@ -2002,6 +2002,24 @@ private Obj_Pos ParseArgFunction(List<object> toks, int pos, ArgFunction fun, in
while (pos < end)
{
t = toks[pos];
if (t is char d_ && (d_ == ',' || d_ == ')'))
{
if (!(fun.maxArgs > fun.minArgs && arg_num >= fun.minArgs
&& ((d_ == ',' && arg_num < fun.maxArgs - 1) // ignore an optional arg that's not the last arg. e.g., "foo(a,,1)", where the second and third args are optional.
|| d_ == ')'))) // ignore the last arg if it is optional. e.g., "foo(a,)", where all args after the first are optional.
throw new RemesPathArgumentException("Omitting a required argument for a function is not allowed", arg_num, fun);
// set defaults to optional args JavaScript-style, by simply omitting a token where the argument would normally go.
args.Add(new JNode());
arg_num++;
pos++;
if (d_ == ')') // last arg was omitted and optional
{
var withargs = new ArgFunctionWithArgs(fun, args);
fun.PadToMaxArgs(args);
return new Obj_Pos(ApplyArgFunction(withargs), pos);
}
continue;
}
// the last Dtype in an ArgFunction's input_types is either the type options for the last arg
// or the type options for every optional arg (if the function can have infinitely many args)
Dtype type_options = fun.TypeOptions(arg_num);
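
The condition guarding omitted arguments in the hunk above is dense, so here it is restated as a small predicate. This is a sketch under assumptions: `OmittedArgSketch` and `CanOmit` are hypothetical names, and the parameters stand in for `fun.minArgs`, `fun.maxArgs`, `arg_num`, and the delimiter token `d_` from the diff. It returns true exactly when the diff's code would not throw `RemesPathArgumentException`.

```csharp
using System;

// Hypothetical sketch; names are illustrative and not part of JsonTools.
public static class OmittedArgSketch
{
    // delim is the token found where an argument was expected: ',' or ')'.
    public static bool CanOmit(int minArgs, int maxArgs, int argNum, char delim)
    {
        bool hasOptionalArgs = maxArgs > minArgs;  // the function has optional args at all
        bool pastRequiredArgs = argNum >= minArgs; // every required arg was already supplied
        bool skipsNonFinalArg = delim == ',' && argNum < maxArgs - 1; // e.g. "foo(a,,1)"
        bool closesCall = delim == ')';            // e.g. "foo(a,)"
        return hasOptionalArgs && pastRequiredArgs && (skipsNonFinalArg || closesCall);
    }
}
```

For a function with one required and two optional arguments (`minArgs = 1`, `maxArgs = 3`), `CanOmit(1, 3, 1, ',')` is true, matching `foo(a,,1)`, while `CanOmit(1, 3, 0, ',')` is false because the first argument is required.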