Commit c865a775 authored by Jonathan Schöbel

docs: collect commit messages #6

Up to (and including) commit 7e53e866
'Merge branch 'feature/fragment''
Thus now all commit messages are collected.
parent ce89d6ac
......@@ -228,6 +228,14 @@ Data:
for modules, manages the database connection and maybe also
contains some caches. At the moment it only provides access to
the Validator.
The two predicates SH_Data_check_tag and SH_Data_check_attr are
wrappers to the appropriate methods of the validator. These are
needed, as there shouldn't be direct calls to the internal
structure of SH_Data.
The modifying methods are not exposed, as the validator
shouldn't be changed while others depend on it. Exposing them
safely has to be implemented later.
Data also contains a wrapper for the self-closing tag predicate.
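The wrapper idea can be sketched as follows. This is only an
illustration under assumptions: the demo_* types and functions are
hypothetical stand-ins, not the actual SH_Data or SH_Validator
definitions, and the linear lookup is a placeholder for whatever the
validator really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical validator: just a list of allowed tag names. */
struct demo_validator {
    const char *const *tags;
    size_t tag_n;
};

/* Hypothetical data object that owns a reference to the validator. */
struct demo_data {
    struct demo_validator *validator;
};

/* Predicate of the validator itself (placeholder linear search). */
bool demo_validator_check_tag(const struct demo_validator *v, const char *tag)
{
    size_t i;

    assert(v != NULL && tag != NULL);
    for (i = 0; i < v->tag_n; i++) {
        if (strcmp(v->tags[i], tag) == 0) {
            return true;
        }
    }
    return false;
}

/* Wrapper: callers ask the data object and never touch its fields. */
bool demo_data_check_tag(const struct demo_data *data, const char *tag)
{
    assert(data != NULL && data->validator != NULL);
    return demo_validator_check_tag(data->validator, tag);
}
```

Only the read-only predicate is forwarded in this sketch; nothing that
modifies the validator is reachable through the data object.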
Attr:
The structure SH_Attr implements an HTML Attribute.
......@@ -284,6 +292,10 @@ Fragment:
possible, as this would lead to problems e.g. double free or
similar quirks.
NodeFragment now uses the validator to validate the tags. The
attributes aren't validated yet, as this is more complicated,
because the tag is needed for that.
The single method (formerly SH_NodeFragment_append_child) to add a child
at the end of the child list was replaced by a bunch of methods to
insert a child at the beginning (SH_NodeFragment_prepend_child), at the
......@@ -357,6 +369,8 @@ Fragment:
A Fragment can output its html. If there is an error the method
aborts and returns NULL.
This method also pays attention to self-closing tags, which are
determined via the validator.
When the wrap mode is used, after each tag a newline is started.
Also the html is indented, which can be configured by the
parameters indent_base, indent_step and indent_char. The
......@@ -454,6 +468,149 @@ Validator:
72(80)-column rule. It can't be abided without severely impacting the
readability of the code.
Originally the ids were intended to be useful for linking different
information together internally, and for providing references
externally. However, they weren't used internally; for this, pointers
seemed to be more useful, as they also allow direct access to the data
and also have a relation defined.
Regarding reference purposes, they aren't really needed, and it is more
convenient to directly use some strings. They also aren't more
performant, as there still have to be internal checks, and looking for
an int isn't more performant than looking for a pointer.
Also, they have to be stored, so they need more memory and also some
code to be handled.
While it was very clever, the complex data structure of the tag array
introduced in 'Validator: restructured internal data (a0c9bb2)' comes
with a lot of runtime overhead. It reduces the calls to free and
realloc when a lot of tags are deleted and inserted subsequently, but
burdens each call with a loop over the linked list of free blocks.
This matters all the more, as the validator must be fast at checking,
since this is done every time something is inserted into the DOM tree.
The requirements for registering new tags are less tight, as this is
mostly done at startup time.
As the access must be fast, the tags are sorted when inserted, so that
the search can take place in logarithmic time.
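A lookup over such a sorted array could look like the sketch below.
The struct and function names are hypothetical and only assume that
the tag names are kept in ascending strcmp order; the real validator
may store more per tag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag table: names are kept sorted by strcmp. */
struct demo_tags {
    const char **names;
    size_t n;
};

/*
 * Binary search for name.
 * Returns 1 and stores the index in *pos if the tag is present.
 * Returns 0 and stores the position where it would have to be
 * inserted to keep the array sorted otherwise.
 */
int demo_find_tag(const struct demo_tags *tags, const char *name, size_t *pos)
{
    size_t lo, hi;

    assert(tags != NULL && name != NULL && pos != NULL);
    lo = 0;
    hi = tags->n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(name, tags->names[mid]);

        if (cmp == 0) {
            *pos = mid;
            return 1;
        }
        if (cmp < 0) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    *pos = lo;
    return 0;
}
```

Returning the insertion position on a miss lets a registering function
reuse the same search to keep the array sorted.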
There is a method to add a set of tags to a validator on initialisation.
First, this relieves a user application of the burden of maintaining
the html spec, and second, it is more performant, as a lot of tags are
inserted at once, so there aren't multiple allocation calls.
As the validator needs the tags to be in order, the tags must be sorted
on insertion. Of course it would be easier for the code if the tags
were already in order, but first, a mistake could easily be made, and
second, sorting the tags by an algorithm allows the tags to be
specified in a logically grouped and thus more maintainable order.
For the sorting, insertion sort is used. Of course it has a worse,
quadratic time complexity, but in a constructor, I wouldn't introduce
the overhead of memory management a heap- or mergesort would introduce,
and in-place sorting is also out, because the data lies in ro-memory.
Thus I chose an algorithm with constant space complexity. Also the
'long' running time is not so important, as the initialization only
runs once at startup and the tags are not likely to exceed a few
hundred, so even a quadratic time isn't that bad.
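The approach can be sketched as an insertion sort that reads from the
read-only declaration and writes into the freshly allocated array, so
that apart from the destination itself no additional memory is needed.
The function name and the plain string array are assumptions made for
this illustration; duplicates are not treated specially here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copies n names from a read-only list into a newly allocated array
 * that is sorted by strcmp. Only the pointers are copied.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
const char **demo_sorted_copy(const char *const *src, size_t n)
{
    const char **dst;
    size_t i, j;

    assert(src != NULL || n == 0);
    if (n > SIZE_MAX / sizeof *dst) {
        return NULL; /* size computation would overflow */
    }
    /* One allocation for all tags; at least one slot, so n == 0 works. */
    dst = malloc((n ? n : 1) * sizeof *dst);
    if (dst == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        /* dst[0..i) is sorted; shift larger entries one slot right. */
        j = i;
        while (j > 0 && strcmp(dst[j - 1], src[i]) > 0) {
            dst[j] = dst[j - 1];
            j--;
        }
        dst[j] = src[i];
    }
    return dst;
}
```

The source array is never written to, which is what makes this usable
with declarations placed in read-only memory.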
Each tag has a type as defined by the html spec. This must be provided
on registration. Implicitly registering tags, when an attribute is
registered can't be done anymore, as the type information would be
missing.
The added parameter in register_tag, as well as the change of behaviour
in register_attr, has broken a lot of tests, which therefore had to be
adjusted.
Added self-closing predicate. Other predicates may follow.
The Validator already contains all HTML5 tags.
Tags according to:
https://html.spec.whatwg.org/dev/indices.html#elements-3
Types according to:
https://html.spec.whatwg.org/multipage/syntax.html#elements-2
Retrieved 04. 10. 2023
An attribute can be deregistered by calling SH_Validator_deregister_attr.
Note that deregistering an attr that was never registered is considered
an error, but this may change, as technically it is not registered
afterwards and sometimes (e.g. for a blacklist) it might be preferable
to ensure that a specific attr is not registered. It is not yet clear
whether there should be an error or not.
Also the deallocation of the data used for an attr was moved to an
extra method, as this is needed in several locations and it might be
subject to change.
The Validator can check if an attribute is allowed in a tag. It does so
by associating allowed tags with attributes. It is done this way in
order to also support attributes which are allowed for every tag
(global attributes), but this is not yet supported. So some functions
allow for NULL to be passed and some will still crash.
The predicate SH_Validator_check_attr returns whether an attribute is
allowed for a specific tag. If tag is NULL, it returns whether an attr
is allowed at all, not whether it is allowed for every tag. For this
another predicate will be provided, when this is to be implemented.
The method SH_Validator_register_attr registers a tag-attr combination.
Note that it will automatically call SH_Validator_register_tag if the
tag doesn't exist. Later it will be possible to set tag to NULL to
register a global attribute, but for now the method will crash.
The method SH_Validator_deregister_attr removes a tag-attr combination
registered earlier. Note that deregistering a nonexistent combination
will result in an error. This behaviour is arguable and might be subject
to change. When setting only tag to NULL, all tags for this attribute
are deregistered. When setting only attr to NULL, all attrs for this tag
are deregistered. This might suffer from problems if it involves some
attrs that are global. Also this will use the internal method
remove_tag_for_all_attrs, which has the problem that it might fail
partially. Normally when failing, all functions revert the program to
the same state as it was before the call. This function however is
different, as if it fails there might be some combinations that haven't
been removed, while others already have been. Nevertheless, the
validator is still in a valid state, so it is possible to call this
function a second time, but it is not certain which combinations are
already deregistered.
As the attrs also use the internal strings of the tags, it must be
ensured, when a tag is deregistered, that all remaining references are
removed, otherwise there would be dangling pointers. Note that
remove_tag_for_all_attrs is also used for this, so the method
SH_Validator_deregister_tag suffers from the same problems listed above.
Also if this internal method fails, the tag won't be removed at all.
Similar to the tags, the attributes can be initialized. Missing tags are
automatically added. The declaration syntax is currently a bit annoying,
as the tags that belong to an attribute either have to be declared
explicitly or a pointer to the tag declaration must be given, but then
only consecutive tags are possible.
Support for global attributes is likewise missing; it must be ensured
that (tag_n != 0) && (tags != NULL). Otherwise the validator will be
inconsistent and there might be a bug.
Global attributes are represented by empty attributes, i.e. attributes
without any associated tags. A global attribute is an attribute that is
accepted for any tag.
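One way to read this representation is sketched below. The field names
tag_n and tags follow the condition mentioned above; everything else,
including the demo_* names and the reading of "empty" as tag_n == 0, is
an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute entry. */
struct demo_attr {
    const char *name;
    const char *const *tags; /* allowed tags; unused if tag_n == 0 */
    size_t tag_n;            /* 0 marks a global attribute */
};

/* An attribute without any associated tag is treated as global. */
bool demo_attr_is_global(const struct demo_attr *attr)
{
    assert(attr != NULL);
    return attr->tag_n == 0;
}

/* Returns whether attr may be used on the given tag. */
bool demo_attr_allows(const struct demo_attr *attr, const char *tag)
{
    size_t i;

    assert(attr != NULL && tag != NULL);
    if (demo_attr_is_global(attr)) {
        return true; /* accepted for any tag */
    }
    for (i = 0; i < attr->tag_n; i++) {
        if (strcmp(attr->tags[i], tag) == 0) {
            return true;
        }
    }
    return false;
}
```

With this encoding no list of "all tags" has to be stored or kept up to
date for a global attribute.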
Removing a specific tag from a global attribute is refused, as this
would mean to "localize" the attribute, thus making it not global
anymore. The method to do that and a predicate for globalness are still
missing.
Deregistering a global attribute for a specific tag normally is not
possible, as basically every other tag has to be added. This has now
been implemented.
Originally it was intended to provide the caller with the information
that a global attribute has to be converted into a local one before
removal. However such internals should not be exposed to the caller. As
it stands there is no real reason to inform a caller whether an
attribute is local or global. Also, there is the problem that the
predicate is burdened with the possibility that the attribute doesn't
exist, thus it can't return a boolean directly. For both reasons, the
predicate hasn't been added yet.
Also a bug was detected in the method remove_tag_for_all_attrs. It
removes an attribute while also iterating over the attributes, thus
potentially skipping over some attributes and maybe also invoking
undefined behaviour by deallocating space after the array.
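The general pattern that avoids this kind of bug is sketched below on a
plain string array. It is not the actual fix applied to
remove_tag_for_all_attrs, and the function name is hypothetical; it
only shows that the index must not advance after a removal, and that
the loop bound has to shrink together with the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Removes every entry equal to name from names[0..n).
 * Returns the new number of entries.
 */
size_t demo_remove_all(const char **names, size_t n, const char *name)
{
    size_t i = 0;

    assert(names != NULL || n == 0);
    assert(name != NULL);
    while (i < n) {
        if (strcmp(names[i], name) == 0) {
            /* Close the gap; the next candidate now sits at index i. */
            memmove(names + i, names + i + 1,
                    (n - i - 1) * sizeof *names);
            n--;
            /* Do not advance i, otherwise the moved entry is skipped. */
        } else {
            i++;
        }
    }
    return n;
}
```

Because n is decremented on every removal, the loop never touches a
slot behind the shrunken array.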
Copying a Validator could be useful if multiple html versions are to be
supported. Another use case is a blacklist XSS-Scanner.
......@@ -565,9 +722,7 @@ Tests:
passed to another unit.
Because sometimes an overflow condition is checked, it is
necessary to include the sourcefile into the test, instead of
linking against the objectfile. This also allows for the
separate testing of static functions, as the static keyword
can be overridden with an empty macro.
linking against the objectfile.
Sometimes it isn't possible to check for correct overflow
detection by setting some number to ..._MAX, because this
number is used, thus a SIGSEGV would be raised. This is solved
......
......@@ -34,7 +34,7 @@ FILE_NAME_1=134;None;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2F
FILE_NAME_2=1737;Sh;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fconfigure.ac;0;8
FILE_NAME_3=73;Make;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2FMakefile.am;0;8
FILE_NAME_4=19;C;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Fmain.c;0;8
FILE_NAME_5=3555;None;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fdocs%2Fcommit_messages.txt;0;8
FILE_NAME_5=31034;None;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fdocs%2Fcommit_messages.txt;0;8
FILE_NAME_6=1867;Make;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Flib%2FMakefile.am;0;8
FILE_NAME_7=18;C;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Flib%2Fsefht%2Fcms.c;0;8
FILE_NAME_8=18;C;0;EUTF-8;1;1;0;%2Fhome%2Fjonathan%2FDokumente%2Fprojekte%2Fprgm%2Finternet%2Fweb%2FSeFHT%2Fsrc%2Flib%2Fsefht%2Fcms.h;0;8
......