From 6b81acb73a30dd74d0156b476ea66289c4b2df9a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jonathan=20Sch=C3=B6bel?= <jonathan@xn--schbel-yxa.info>
Date: Sun, 8 Oct 2023 16:38:54 +0200
Subject: [PATCH] docs: collect commit messages #4

Up to (and including) commit 9518d8356acbe88f943293898cd65d85c79a007d
'Merge branch 'feature/text''
---
 docs/commit_messages.txt | 164 +++++++++++++++++++++++++++++++++++++--
 sefht.geany              |   2 +-
 2 files changed, 160 insertions(+), 6 deletions(-)

diff --git a/docs/commit_messages.txt b/docs/commit_messages.txt
index 4cd6f96..ff36829 100644
--- a/docs/commit_messages.txt
+++ b/docs/commit_messages.txt
@@ -9,8 +9,38 @@ sefht.geany:
 	file also often creates merge conflicts, which could be
 	avoided, if this file was not tracked by the VCS.
 
+.gitignore:
+	Note that in this project it is chosen, to not include
+	generated files into the version control system, but of course
+	they must be always included for distributions.
+
 configure.ac:
 	This package uses the GNU Autotools.
+	Until now, the configure script just checked for check to be installed,
+	which is needed to compile the tests.
+	Now, configure provides a conditional (MISSING_CHECK) depending on its
+	presence for use by automake. If check is missing, the tests aren't
+	compiled. Instead a special script is executed to inform the user of the
+	problem and stop the testsuite. Note that it was not possible to
+	directly stop the generation of the testsuite by injecting a rule into a
+	Makefile without relying on implementation details of automake.
+	See:
+	https://stackoverflow.com/questions/76376806/automake-how-to-portably-throw-an-error-and-aborting-the-target/76382437
+	To allow the script to issue messages to stderr, AM_TESTS_FD_REDIRECT is
+	used, because the parallel test harness redirects output of its tests to
+	logfiles. This isn't used for the serial test harness, because there is
+	no redirection to logfiles, but there AM_TESTS_FD_REDIRECT is also not
+	taken into account.
+	See:
+	https://www.gnu.org/software/automake/manual/html_node/Testsuite-Environment-Overrides.html
+	Additionally configure also provides an argument to enforce both
+	behaviours. When specifying --enable-tests=no the tests are not compiled
+	regardless of the presence of check. If --enable-tests=yes, it is
+	assumed, that tests are really needed and the mandatory check for check
+	is performed thus providing the former behaviour. If not specified
+	--enable-tests defaults to auto, which results in the same behaviour as
+	--enable-tests=yes, if check is present, and like --enable-tests=no
+	otherwise.
 
 main.c:
 	As this project is about a library, a main.c would not be
@@ -43,6 +73,8 @@ Compilation:
 	buggy. Testing the later would defy the concept of unittests.
 	Thus, the easiest way to test for a proper working function, is
 	to test for the internal state.
+	To also save the instructions for the call, which are a lot of
+	overhead now, link time optimization is turned on. (-flto)
 
 Error handling:
 	Error handling is done by the status structure. The name was
@@ -124,6 +156,32 @@ Error handling:
 	worked, nor actually took place. [citation needed] Those the
 	operation can be retried (hopefully).
 
+raw methods:
+	The library provides a way to directly access the tag in a read-only
+	way, which saves a call to strdup. This is useful if only reading is
+	necessary, but needs special care by developers, as it is neither
+	allowed to modify it nor to free it. Disregarding this will lead to a
+	segfault in the best, and to silent data corruption and security bugs in
+	the worst case.
+	When there are methods in the api/abi, that take pointers to strings to
+	store them in the library, there are two methods to do so. Either they
+	are copying the string and leaving it intact, or they directly assign
+	the given pointer to some internal storage. While the former method is
+	safer in terms of memory, as the user doesn't have to remember that he
+	can't use the string anymore, the latter can be more efficient, as there
+	is no extra strdup call, but the user is not allowed to change the
+	pointer, free it and also can't use the pointer, because it can't be
+	known whether it is already freed by the library. As it should be
+	decidable by the user, the library often implements both approaches,
+	where the methods that directly store pointers without creating a copy
+	contain the raw_ prefix.
+
+goto:
+	Sometimes the common code to cleanup in case of an error is
+	bundled at the end of a function. For people complaining about
+	the use of goto: this is the exact use case, where it is
+	recommended!
+
 splint:
 	The source has been adapted to splint, which still tells about
 	some errors, but they are all checked to be false-positives.
@@ -141,6 +199,13 @@ Data:
 	contains some caches. At the moment it only provides access to
 	the Validator.
 
+Attr:
+	The structure SH_Attr implements an HTML Attribute.
+	For every function there is also a static method/function,
+	which can perform the same work, but doesn't rely on really
+	having a single struct Attr. This is useful for example in an
+	array to manipulate a single element.
+
 Fragment:
 	Fragment is the core of SeFHT. (As the name suggests)
 	A Fragment can be every part of a website. The website is
@@ -156,11 +221,14 @@ Fragment:
 	represented by a structure of function pointers.
 	The data needed by a fragment is, a pointer to the Data object,
 	which is needed for getting any kind of information a fragment
-	might need, and a pointer to the parent node, which is needed
-	both for access to it and also to ensure, that each fragment
-	has exactly one parent. This is necessary to prevent data
-	corruption and also to keep clear who is responsible, for
-	freeing the fragment.
+	might need, and a pointer to the parent node, which is useful
+	for both traversing the tree and checking for cycles, i.e. that
+	each fragment has exactly one parent, when a node is added.
+	This is necessary to prevent data corruption and also to keep
+	clear who is responsible for freeing the fragment.
+	Both traversing and ensuring consistency wouldn't be possible
+	otherwise.
+
 	The methods each fragment has to be implement are a copy method,
 	a free method (destructor) and a method to output the html.
 	Also every class has a method, which checks, if a given fragment
@@ -185,6 +253,78 @@ Fragment:
 	Adding the same element twice in the tree (graph) isn't
 	possible, as this would lead to problems e.g. double free or
 	similar quirks.
+
+	The single method (formerly SH_NodeFragment_append_child) to add a child
+	at the end of the child list was replaced by a bunch of methods to
+	insert a child at the beginning (SH_NodeFragment_prepend_child), at the
+	end (SH_NodeFragment_append_child), at a specific position
+	(SH_NodeFragment_insert_child) and directly before
+	(SH_NodeFragment_insert_child_before) or after another child
+	(SH_NodeFragment_insert_child_after). All these methods are implemented
+	by a single internal one (insert_child), as there isn't really much
+	difference in inserting one or the other way.
+	But this internal method doesn't check whether this insertion request is
+	actually doable, to save overhead as not every insertion method requires
+	this check. This is done by the respective method. However if the check
+	is not done correctly the internal method will attempt to write to
+	unallocated space, which will hopefully result in a segfault.
+
+	The child list is implemented as an array. To reduce the overhead of
+	realloc calls, the array is allocated in chunks of children. The
+	calculation how many have to be allocated is done by another static
+	method and determined by the macro CHILD_CHUNK. This is set to 5, which
+	is just a guess. It should be somewhere around the average number of
+	children per html element, to reduce unused overhead.
+
+	Also some predicates (SH_NodeFragment_is_parent,
+	SH_NodeFragment_is_ancestor) were added to check whether a relationship
+	exists between two nodes, thus whether they are linked through one or
+	multiple levels. These functions could replace the old ones
+	(SH_NodeFragment_is_child, SH_NodeFragment_is_descendant) semantically.
+	Furthermore they are more efficient as it is now possible to check
+	via the parent pointer. The internal insert method also uses these
+	methods to check whether the child node is actually a parent of the
+	parent node, which would result in errors later on.
+
+	The old test is now obsolete but remained, as it is not bad to test
+	more.
+
+	Various remove methods were added, which are all implemented by a
+	static method, analogous to the insert methods.
+
+	The method SH_NodeFragment_get_attr provides a pointer to an Attr, by
+	its index. Note that it directly points to the internal data, instead
+	of copying the data to a new Attr, which would be unnecessary overhead,
+	if only reading access is needed. That's why it is also a const pointer.
+	If the user intends to modify it, a copy should be taken via
+	SH_Attr_copy.
+
+	Multiple insert methods allow either to add an existing Attr, or to
+	create a new one implicitly. If the Attr is not already used beforehand,
+	it is more efficient to call the attr_new methods. Also an old Attr is
+	freed, after it was inserted, thus it can't be used afterwards. This is
+	necessary, as for efficiency reasons an array of Attr is used directly,
+	instead of the indirect approach of storing a pointer to Attr. This
+	means, that the contents of the Attr have to be copied to the internal
+	structure. If the old Attr were left unfreed, there would be two
+	Attrs, the original one and the implicit one, referring to the same
+	data, which would lead to at least data corruption, or undefined
+	behaviour like a double free, which would be a serious threat for a
+	library which is to be used on a webserver. ...
+	For each of the two insert modes, there is a method to prepend, append
+	or insert at a specific position. An incorrect position is handled
+	inside of the external method and an E_VALUE is thrown. The internal
+	method doesn't handle this, so special care must be taken not to cause
+	undefined behaviour. However enforcing this check would be unnecessary
+	overhead for the prepend and append methods, which are known to have
+	correct indices, as well as for other internal methods, where the internal
+	method may be used.
+
+	Two alternatives are provided: remove_attr and pop_attr. While the
+	former frees the Attr's data, the latter allocates a new Attr, to store
+	and return the data. Both functionalities are provided by a single
+	(internal) static method.
+
 	A Fragment can output it's html. If there is an error the method
 	aborts and returns NULL.
 	When the wrap mode is used, after each tag a newline is started.
@@ -197,6 +337,15 @@ Fragment:
 	a string longer than a single character).
 	This arguments can't be set by the user, but are hardcoded (by
 	now).
+	The to_html method also generates the html for the attributes.
+	Note that there is no escaping of the quotes, the values are
+	wrapped with. But this is also somewhat consistent, as there is
+	no syntax validation on the tags either.
+	(i.e. no '<' inside of a tag)
+
+	NodeFragment is virtually finished, but TextFragment is still
+	missing, as it depends on still not implemented functionality
+	of SH_Text.
 
 Validator:
 	Validator serves as an syntax checker, i.e. it can be requested
@@ -303,6 +452,11 @@ Text:
 	this is a non trivial function, so don't use it to exhaustively.
 	The method SH_Text_print just prints the whole string to stdout.
 
+	The function SH_Text_set_char allows writing a single character to a
+	position that already exists in the text, thus overwriting another
+	character. If the index is out of range, a value error is set and FALSE
+	is returned.
+
 
 Tests:
 	Tests are done using check, allowing to integrate the tests
diff --git a/sefht.geany b/sefht.geany
index d417693..7dda934 100644
--- a/sefht.geany
+++ b/sefht.geany
@@ -34,7 +34,7 @@ FILE_NAME_1=134;None;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2F
 FILE_NAME_2=1737;Sh;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fconfigure.ac;0;8
 FILE_NAME_3=73;Make;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2FMakefile.am;0;8
 FILE_NAME_4=19;C;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Fmain.c;0;8
-FILE_NAME_5=10380;None;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fdocs%2Fcommit_messages.txt;0;8
+FILE_NAME_5=23257;None;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fdocs%2Fcommit_messages.txt;0;8
 FILE_NAME_6=1867;Make;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Flib%2FMakefile.am;0;8
 FILE_NAME_7=18;C;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Flib%2Fsefht%2Fcms.c;0;8
 FILE_NAME_8=18;C;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Flib%2Fsefht%2Fcms.h;0;8
-- 
GitLab