external/uBlock

mirror of https://github.com/gorhill/uBlock.git synced 2024-11-11 17:41:03 +01:00

Author	SHA1	Message	Date
Raymond Hill	be2a950541	Code review of HNTrie/staticNetFilteringEngine - Remove HNTrieContainer class from global context by storing it as a property of µBlock. - Use block scope to isolate HNTrie-related constants from global context. - Prevent filters which are pure IP address from being stored in an HNTrie instance -- as this could cause false positives.	2019-06-19 10:00:19 -04:00
Raymond Hill	39e2a03edb	Fix comment	2019-05-14 09:31:51 -04:00
Raymond Hill	3692bb4ada	Add HNTrieRef.dump() and STrieRef.dump() as dev tool To be used at the console, as an investigation tool for development purpose. Using it to verify the content of the largest FilterHostnameDict instance, I spotted an all-uppercase hostname in the HNTrieRef instance: µBlock.staticNetFilteringEngine.categories.get(0).get(0x10000000).dict.dump(); Thus the changes to static-net-filtering.js are to fix the erroneous insertion of filters with uppercase characters. The single instance found was a hostname entry in Malware Domain List (TRIANGLESERVICESLTD dot COM).	2019-05-06 11:12:39 -04:00
Raymond Hill	42bf659695	Revert "Order HNTrie nodes alphabetically to allow for early bailout" This reverts commit `f5f9e05071`.	2019-04-30 07:00:52 -04:00
Raymond Hill	f5f9e05071	Order HNTrie nodes alphabetically to allow for early bailout This commit implements the alphabetical ordering of HNTrie nodes, so as to make it possible to bail out early at HNTrie.matches() time. Contrary to what I expected, there is no performance gain observed to HNTrie.matches() as per benchmarks -- I find the results perplexing. Because of this I will revert this commit immediately. The purpose of this commit is to record the changes so that I can bring them back to life in the future whenever I want to investigate further.	2019-04-30 06:47:54 -04:00
Raymond Hill	adabb56dc9	Do not store impossible to match filters in HNTrie Consider the following two filters: example.com and www.example.com. This commit makes it so that if the first filter is already present in a given HNTrie, the second filter will not be stored, since HNTrie will _always_ return the first filter as a match whenever the hostname to match is example.com or any subdomain of example.com. The detection of such pointless filters is virtually free when adding a hostname to an HNTrie instance (given how data is stored in the trie), so in practice no overhead is incurred to detect such pointless filters. The ability to ignore impossible to match filters in HNTrie instances will _especially_ benefit those using large hosts files. Examples of how this helps using real configurations: - Default lists: 444 filters out of 100,382 were ignored as a result of this commit. - Default lists + "Energized Ultimate Protection": 283,669 filters out of 903,235 were ignored as a result of this commit. Side note: There was no measurable difference between the two configurations above in the performance of the matching algorithm as reported by the built-in benchmark tool.	2019-04-29 13:15:16 -04:00
Raymond Hill	e0d2285da0	Convert HNTrie code to ES6 `class`	2019-04-25 19:38:07 -04:00
Raymond Hill	155abfba18	Cache and reuse result of HNTrieRef.matches() when possible Due to how web pages typically load secondary resources and due to how HNTrieContainer instances are used in uBO, there is a great likelihood that the result of a previous call to HNTrieRef.matches() can be reused in a subsequent call. This has been confirmed by instrumenting HNTrieRef.matches(). Since uBO uses distinct HNTrieContainer instances to match against either the request or the origin hostnames, this means a high likelihood of repeated calls to HNTrieRef.matches() with the same hostname as argument, hence a performance gain when caching the argument+result: although HNTrie.matches() is fast, comparing two short strings is even faster when it allows skipping HNTrie.matches() altogether.	2019-04-25 18:36:03 -04:00
Raymond Hill	fa83744b58	Use a sequence of base 64 numbers to encode array buffers The purpose of using a custom base128 encoder is to convert array buffers into strings, to allow a direct string-to-array buffer conversion at load time: string => array buffer. Whereas a JSON array would require an extra step: JSON array as string => JS array => array buffer. Turns out that the current use of a custom base128 encoding results in a significantly larger selfie storage usage when converting array buffers into strings. Speculation: possibly the browser converts the strings to save into JSON strings internally. Since the custom base128 encoder is likely to cause the resulting string to contain a lot of unprintable ASCII characters, these will need to be escaped when converted to JSON -- escaped characters occupy more space than non-escaped ones. Using a sequence of base 64 numbers means only printable characters will be present in the output string, hence no escaping is necessary. I have observed a significant reduction in storage usage for selfie purposes.	2019-04-20 09:06:54 -04:00
Raymond Hill	008370e4b9	Fix https://github.com/uBlockOrigin/uBlock-issues/issues/461 uBO will fallback using a JSON string when trying to encode an array buffer in Chromium version 59 and earlier.	2019-03-16 09:00:31 -04:00
Raymond Hill	ed7e34fb07	Refactor selfie generation into a more flexible persistence mechanism The motivation is to address the higher peak memory usage at launch time with 3rd-gen HNTrie when a selfie was present. The selfie generation prior to this change was to collect all filtering data into a single data structure, and then to serialize that whole structure at once into storage (using JSON.stringify). However, HNTrie serialization requires that a large Uint32Array be converted into a plain JS array, which itself would be indirectly converted into a JSON string. This was the main reason why peak memory usage would be higher at launch from selfie, since the JSON string would need to be wholly unserialized into JS objects, which themselves would need to be converted into more specialized data structures (like that Uint32Array one). The solution to lower peak memory usage at launch is to refactor selfie generation to allow a more piecemeal approach: each filtering component is given the ability to serialize itself rather than to be forced to be embedded in the master selfie. With this approach, the HNTrie buffer can now serialize to its own storage by converting the buffer data directly into a string which can be directly sent to storage. This avoids expensive intermediate steps such as converting into a JS array and then to a JSON string. As part of the refactoring, there was also opportunistic code upgrade to ES6 and Promise (eventually all of uBO's code will be proper ES6). Additionally, the polyfill to bring getBytesInUse() to Firefox has been revisited to replace the rather expensive previous implementation with an implementation with virtually no overhead.	2019-02-14 13:33:55 -05:00
Raymond Hill	fc03782985	Ensure that WASM module was actually loaded	2019-02-01 09:09:51 -05:00
Raymond Hill	69c87c5117	Fix Promise chain of WASM module load operations The Promise chain was not properly designed for WASM module loading. This became apparent when removing WASM modules from Opera build[1]. The problem was that errors thrown by fetch() -- used to load WASM modules -- were not properly handled. [1] Opera refuses updating uBO if there are unrecognized file types in the package, and `.wasm`/`.wat` files are not recognized by Opera uploader.	2019-02-01 08:20:43 -05:00
Raymond Hill	1b6fea16da	3rd-gen hntrie, suitable for large set of hostnames	2018-12-04 13:02:09 -05:00
Raymond Hill	2a91a685ce	code review: fix handling of too long needles	2018-11-19 14:04:26 -05:00
Raymond Hill	2189f020df	add new advanced setting to disable use of WASM for dev purpose	2018-11-16 10:19:06 -05:00
Raymond Hill	19b7cbca55	minor review of hntrie code	2018-11-06 13:38:37 -02:00
Raymond Hill	a42513aa2f	minor code review	2018-11-04 19:26:02 -02:00
Raymond Hill	95899a0d1d	be explicit about where the related wasm file is fetched	2018-11-04 18:52:25 -02:00
Raymond Hill	d7d544cda0	Squashed commit of the following: commit 7c6cacc59b27660fabacb55d668ef099b222a9e6 Author: Raymond Hill <rhill@raymondhill.net> Date: Sat Nov 3 08:52:51 2018 -0300 code review: finalize support for wasm-based hntrie commit 8596ed80e3bdac2c36e3c860b51e7189f6bc8487 Merge: cbe1f2e `000eb82` Author: Raymond Hill <rhill@raymondhill.net> Date: Sat Nov 3 08:41:40 2018 -0300 Merge branch 'master' of github.com:gorhill/uBlock into trie-wasm commit cbe1f2e2f38484d42af3204ec7f1b5decd30f99e Merge: 270fc7f `dbb7e80` Author: Raymond Hill <rhill@raymondhill.net> Date: Fri Nov 2 17:43:20 2018 -0300 Merge branch 'master' of github.com:gorhill/uBlock into trie-wasm commit 270fc7f9b3b73d79e6355522c1a42ce782fe7e5c Merge: d2a89cf `d693d4f` Author: Raymond Hill <rhill@raymondhill.net> Date: Fri Nov 2 16:21:08 2018 -0300 Merge branch 'master' of github.com:gorhill/uBlock into trie-wasm commit d2a89cf28f0816ffd4617c2c7b4ccfcdcc30e1b4 Merge: d7afc78 `649f82f` Author: Raymond Hill <rhill@raymondhill.net> Date: Fri Nov 2 14:54:58 2018 -0300 Merge branch 'master' of github.com:gorhill/uBlock into trie-wasm commit d7afc78b5f5675d7d34c5a1d0ec3099a77caef49 Author: Raymond Hill <rhill@raymondhill.net> Date: Fri Nov 2 13:56:11 2018 -0300 finalize wasm-based hntrie implementation commit e7b9e043cf36ad055791713e34eb0322dec84627 Author: Raymond Hill <rhill@raymondhill.net> Date: Fri Nov 2 08:14:02 2018 -0300 add first-pass implementation of wasm version of hntrie commit 1015cb34624f3ef73ace58b58fe4e03dfc59897f Author: Raymond Hill <rhill@raymondhill.net> Date: Wed Oct 31 17:16:47 2018 -0300 back up draft work toward experimenting with wasm hntries	2018-11-03 08:58:46 -03:00
gorhill	e83ffde5af	code review for #3328	2017-12-08 07:07:05 -05:00
gorhill	c7e8b65b6c	fix #3328	2017-12-08 00:33:02 -05:00
gorhill	4d20950dfa	save investigative work for the future re. wasm	2017-11-05 12:33:46 -05:00
gorhill	da605f53a6	code review: avoid pointless test for single-char cells	2017-11-05 06:45:43 -05:00
gorhill	22c460d52f	just edit comments	2017-11-03 08:36:16 -04:00
gorhill	5928996f2a	address #3193	2017-11-02 15:49:11 -04:00
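
The idea recorded in commit adabb56dc9 (skipping a hostname when one of its parent domains is already stored) can be illustrated with a minimal sketch. This is not uBO's HNTrie, which stores its data in a typed-array buffer; the class below is a hypothetical label-level trie built on `Map`, and the names `HostnameTrieSketch`, `add`, `matches` and `ignored` are assumptions made for illustration only.

```javascript
// Hypothetical sketch: a label-level trie, NOT uBO's actual HNTrie layout.
class HostnameTrieSketch {
    constructor() {
        this.root = { children: new Map(), terminal: false };
        this.ignored = 0; // count of hostnames that were not stored
    }

    // Returns true if the hostname was stored, false if it was redundant.
    add(hostname) {
        // Walk labels from the TLD inward: "www.example.com" -> com, example, www
        const labels = hostname.toLowerCase().split('.').reverse();
        let node = this.root;
        for ( const label of labels ) {
            // A parent domain already ends here, so it would always match
            // first: the longer hostname can never be the matching entry.
            if ( node.terminal ) { this.ignored += 1; return false; }
            let next = node.children.get(label);
            if ( next === undefined ) {
                next = { children: new Map(), terminal: false };
                node.children.set(label, next);
            }
            node = next;
        }
        // Exact duplicates are also rejected in this sketch.
        if ( node.terminal ) { this.ignored += 1; return false; }
        node.terminal = true;
        return true;
    }

    // True if the hostname, or any of its parent domains, is stored.
    matches(hostname) {
        const labels = hostname.toLowerCase().split('.').reverse();
        let node = this.root;
        for ( const label of labels ) {
            node = node.children.get(label);
            if ( node === undefined ) { return false; }
            if ( node.terminal ) { return true; }
        }
        return false;
    }
}
```

In this sketch the redundancy check costs one boolean test per label during a walk that insertion has to perform anyway, which is consistent with the commit's remark that detection is virtually free. The sketch does not prune a subdomain that was stored before its parent domain was added.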
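
The caching described in commit 155abfba18 can be sketched as a thin wrapper that remembers the last argument and its result. The wrapper name `CachedMatcherSketch`, its `invalidate()` method and the injected `matchFn` are hypothetical; the commit does not show how HNTrieRef stores the cached pair, so this is only one plausible shape under that assumption.

```javascript
// Hypothetical sketch of last-call caching around an expensive matcher.
class CachedMatcherSketch {
    constructor(matchFn) {
        this.matchFn = matchFn;      // the real lookup, e.g. a trie walk
        this.lastNeedle = undefined; // hostname used in the previous call
        this.lastResult = false;     // result returned for that hostname
    }

    matches(needle) {
        // Comparing two short strings is cheaper than a full lookup, so
        // only call through when the hostname actually changed.
        if ( needle !== this.lastNeedle ) {
            this.lastNeedle = needle;
            this.lastResult = this.matchFn(needle);
        }
        return this.lastResult;
    }

    // Assumption: must be called whenever the underlying set changes,
    // otherwise a stale result could be returned.
    invalidate() {
        this.lastNeedle = undefined;
    }
}
```

A single-entry cache fits the access pattern the commit describes, where many consecutive requests from one page share the same origin hostname.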
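
The encoding approach in commit fa83744b58 (array buffer to a string of printable characters, and back, without a JSON array in between) can be sketched as follows. The exact format uBO uses is not given in the commit message; this sketch assumes a fixed width of six base-64 digits per 32-bit value and the standard base-64 alphabet, and the names `encodeUint32Array` and `decodeUint32Array` are hypothetical.

```javascript
// Hypothetical sketch: fixed-width base-64 digits, not uBO's actual format.
const DIGITS =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
const WIDTH = 6; // 6 digits x 6 bits = 36 bits, enough for a 32-bit value

// Encode each 32-bit value as WIDTH printable characters, most
// significant digit first.
function encodeUint32Array(arr) {
    const out = [];
    for ( let v of arr ) {
        let chunk = '';
        for ( let i = 0; i < WIDTH; i++ ) {
            chunk = DIGITS[v & 63] + chunk;
            v >>>= 6;
        }
        out.push(chunk);
    }
    return out.join('');
}

// Decode straight from the string into a typed array: no intermediate
// plain JS array and no JSON.parse step.
function decodeUint32Array(str) {
    const arr = new Uint32Array(str.length / WIDTH);
    for ( let i = 0; i < arr.length; i++ ) {
        let v = 0;
        for ( let j = 0; j < WIDTH; j++ ) {
            v = v * 64 + DIGITS.indexOf(str[i * WIDTH + j]);
        }
        arr[i] = v;
    }
    return arr;
}
```

Because every output character is printable and needs no escaping in a JSON string, `JSON.stringify()` of the encoded string only adds the two surrounding quotes, which is the property the commit relies on to reduce storage usage.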