Design pattern question
Hello,

I'm trying to solve the following issue, which is puzzling me. I have a concrete class, say Parser, which provides some basic parsing functionality (such as reading word by word, line by line, etc.). Then I have a second, Parser-derived class, say HtmlParser, which besides providing the Parser functionality provides extra features, such as reading tag by tag, attributes, etc.

What I'd like to achieve is a "factory" which would return the proper class depending on the content type received, so I'd call this factory like factory.create("html");

The problem I'm facing is that the derived parser will contain more methods which are not available in the base class. So how can I achieve generic behavior without relying on if/else/switch constructs? So far the only way I see is dynamic casting the Parser pointer returned by the factory, but I feel this defeats the whole purpose of a factory. The other obvious approach would be putting all methods in the base class, but some of them wouldn't make any sense, such as reading a tag from generic text.

Any hint as to where I can find some pointers? Thank you.
Write the class that uses the parser first. Write it in such a way that it will work no matter what format the text is in. Then your problem will be solved. Think more abstractly.
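One way to read this advice, as a rough sketch: if the client only ever needs "the next word", then that is the only operation the base interface has to offer, and tag handling stays an internal detail of the HTML implementation. The names `TextSource`, `PlainText`, `HtmlText` and `count_words` are hypothetical and do not come from the original post; the tag stripping is deliberately naive.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical interface: the only thing client code needs is "the next word".
class TextSource {
public:
    virtual ~TextSource() {}
    // Stores the next word in `out`; returns false when the input is exhausted.
    virtual bool next_word(std::string& out) = 0;
};

// Plain text: words are whitespace-separated.
class PlainText : public TextSource {
public:
    explicit PlainText(const std::string& text) : in_(text) {}
    bool next_word(std::string& out) override { return static_cast<bool>(in_ >> out); }
private:
    std::istringstream in_;
};

// HTML: tags are handled internally; callers still just see words.
class HtmlText : public TextSource {
public:
    explicit HtmlText(const std::string& html) : in_(strip_tags(html)) {}
    bool next_word(std::string& out) override { return static_cast<bool>(in_ >> out); }
private:
    // Naive tag stripping for illustration only (ignores comments, scripts, entities).
    static std::string strip_tags(const std::string& html) {
        std::string text;
        bool in_tag = false;
        for (char c : html) {
            if (c == '<') { in_tag = true; text += ' '; }
            else if (c == '>') { in_tag = false; }
            else if (!in_tag) { text += c; }
        }
        return text;
    }
    std::istringstream in_;
};

// Client code: written once, works for any TextSource.
std::size_t count_words(TextSource& source) {
    std::size_t n = 0;
    std::string word;
    while (source.next_word(word)) ++n;
    return n;
}

// Usage sketch: the same client function handles both formats.
inline void usage_example() {
    PlainText plain("hello world");
    HtmlText html("<p>hello <b>big</b> world</p>");
    assert(count_words(plain) == 2);
    assert(count_words(html) == 3);
}
```

With the client written this way, a factory can return a plain `TextSource` pointer and no cast is needed, because nothing outside `HtmlText` ever asks for a tag.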
Ok, there are a couple of issues revealed here:

1) From your description, you are writing a lexical analyzer (scanner and tokenizer); do not confuse a parser with a lexical analyzer. Start from here: http://en.wikipedia.org/wiki/Semantic_analysis_%28computer_science%29 Know exactly what you want and define your problem. Do you want a lexer or a parser?

2) A factory method works best when parallel hierarchies exist. You can formulate your hierarchies into a lexer hierarchy and a token hierarchy. Then a lexer class can create a token class through covariant return types and virtual methods:

    class attr;  // assumed to be defined elsewhere

    // The token classes come first: a covariant return type needs
    // html_token to be a complete type derived from token.
    class token {
    public:
        token(const attr & at) { /* construct a token */ }
        virtual ~token() {}
    };

    class html_token : virtual public token {
    public:
        // a virtual base must be initialized by the most derived class
        html_token(const attr & at) : token(at) { /* construct an html token */ }
    };

    class lexer {
    public:
        virtual token * create_token(const attr & at) { return new token(at); }
        virtual ~lexer() {}
    };

    class html : virtual public lexer {
    public:
        // covariant return type: html_token * overrides token *
        html_token * create_token(const attr & at) { return new html_token(at); }
    };
3) There is nothing wrong with if/else/switch constructs. In fact, IMO that's the only way you can initialize objects dynamically within the realm of C++. You can use platform-specific features, e.g. dynamic libraries and name resolution, to build a more generic solution, but that does not have anything to do with C++.
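The branching does not have to be spelled out at every call site, though: it can be kept in one lookup table that maps a content type to a creator function, which still uses nothing beyond standard C++. A minimal sketch, assuming hypothetical `Parser`, `HtmlParser` and `ParserFactory` classes (the `name()` method exists only so the example has something observable, and none of these names come from the posts above):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

class Parser {
public:
    virtual ~Parser() {}
    virtual std::string name() const { return "text"; }
};

class HtmlParser : public Parser {
public:
    std::string name() const override { return "html"; }
};

class ParserFactory {
public:
    typedef std::unique_ptr<Parser> (*Creator)();

    // Associates a content type with a function that builds the matching parser.
    void register_type(const std::string& content_type, Creator creator) {
        assert(creator != nullptr);
        creators_[content_type] = creator;
    }

    // Returns a null pointer for unknown content types.
    std::unique_ptr<Parser> create(const std::string& content_type) const {
        std::map<std::string, Creator>::const_iterator it = creators_.find(content_type);
        if (it == creators_.end()) return std::unique_ptr<Parser>();
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Generic creator: one instantiation per concrete parser type.
template <typename T>
std::unique_ptr<Parser> make_parser() {
    return std::unique_ptr<Parser>(new T());
}

// Builds a factory that knows the two example types.
inline ParserFactory make_default_factory() {
    ParserFactory factory;
    factory.register_type("text", &make_parser<Parser>);
    factory.register_type("html", &make_parser<HtmlParser>);
    return factory;
}
```

Adding a new format then means one more `register_type` call rather than another branch. The caller still only gets a `Parser` pointer back, so this does not by itself solve the problem of the extra HtmlParser methods.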
Hi, thanks a lot for your answer.

> Ok there are a couple issues revealed here:
> 1) From your description, you are writing a lexical analyzer (scanner and tokenizer), do not confuse a parser with a lexical analyzer. Start from here: http://en.wikipedia.org/wiki/Semantic_analysis_%28computer_science%29
> Know exactly what you want and define your problem. Do you want a lexer or a parser?

Yes, thanks for letting me know. I didn't know enough about the separation of these processes. Actually, it seems I need both of them.

The idea is basically to be able to interpret an HTTP header, and also other protocols, RTSP for example. Both have similarities, but some things are specific to each protocol. Am I wrong if I say that in most cases the analyzing and parsing is done in the same place?

My idea is simply to take a stream of data and build a data structure holding the different headers, with their fields and attributes. I don't think I need to write a lexical analyzer and then a parser. Could I do both things at the same time?
Thanks. I definitely need to give this more thought, or study existing examples to see how they do this.
alebc @gmail.com wrote:
> Am I wrong if I say that in most cases the analyzing and parsing is done in the same place?
They work this way: a parser is fed by a lexer. Usually a token signals the start of a syntactic block that the parser can analyze. They are not done in the same place (I assume you meant lexical and syntactic analysis). Check EBNF and Boost::spirit. In your case I don't think you need a recursive descent parser. A simple EBNF parser is most likely adequate.
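To make the split concrete, here is a minimal sketch for a single `Name: value` header line of the kind found in HTTP and RTSP. The token kinds and the functions `lex_header_line` and `parse_header_line` are hypothetical names, and the grammar is deliberately simplified: no continuation lines, no quoted strings, and a colon inside the value is rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Lexical analysis: turn characters into tokens.
enum TokenKind { WORD, COLON };

struct Token {
    TokenKind kind;
    std::string text;
};

std::vector<Token> lex_header_line(const std::string& line) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        if (line[i] == ' ' || line[i] == '\t') {
            ++i;  // whitespace separates tokens but is not a token itself
        } else if (line[i] == ':') {
            tokens.push_back(Token{COLON, ":"});
            ++i;
        } else {
            std::size_t start = i;
            while (i < line.size() && line[i] != ' ' && line[i] != '\t' && line[i] != ':') ++i;
            tokens.push_back(Token{WORD, line.substr(start, i - start)});
        }
    }
    return tokens;
}

// Syntactic analysis: check the token order and build the data structure.
// Simplified grammar: header = WORD ":" WORD { WORD }
// Returns false if the tokens do not match that grammar.
bool parse_header_line(const std::vector<Token>& tokens,
                       std::pair<std::string, std::string>& header) {
    if (tokens.size() < 3) return false;
    if (tokens[0].kind != WORD || tokens[1].kind != COLON) return false;
    std::string value;
    for (std::size_t i = 2; i < tokens.size(); ++i) {
        if (tokens[i].kind != WORD) return false;
        if (!value.empty()) value += ' ';
        value += tokens[i].text;
    }
    header = std::make_pair(tokens[0].text, value);
    return true;
}

// Usage sketch: the parser consumes what the lexer produced.
inline void usage_example() {
    std::pair<std::string, std::string> header;
    assert(parse_header_line(lex_header_line("Content-Length: 42"), header));
    assert(header.first == "Content-Length" && header.second == "42");
    assert(!parse_header_line(lex_header_line("no colon here"), header));
}
```

The lexer knows nothing about what a header is, and the parser never looks at individual characters. Protocol-specific rules would go into the parsing step, while the lexer could be shared.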
In article <f446ie$v3@aioe.org>, fei@aepnetworks.com says...

[ ... ]

> Check EBNF and Boost::spirit. In your case I don't think you need a recursive descent parser. A simple EBNF parser is most likely adequate.
There seems to be a bit of confusion here. EBNF and recursive descent are orthogonal. EBNF is a language in which you express a grammar -- i.e. you specify the language that will be accepted by the parser.

Recursive descent is a method of implementing a parser. A recursive descent parser is a top-down parser. As you'd guess when it's expressed that way, the primary alternative is a bottom-up parser.

Using Boost.Spirit, you specify the input in a modified form of EBNF, and the library produces a recursive descent parser from that input. Bottom-up parsers are most often produced by yacc, bison, byacc, and other such parser generator tools.

--
Later, Jerry.
The universe is a figment of its own imagination.
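To illustrate the distinction between the notation and the method, here is a small grammar written in EBNF (in the comment at the top) and a recursive descent parser for it, with one function per grammar rule. This is a hand-written sketch, not something generated by Boost.Spirit or any other tool; the grammar and the names `SumParser`, `parse_sum`, `parse_term` and `parse_number` are made up for the example, and overflow is not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// EBNF grammar being implemented:
//   sum    = term , { "+" , term } ;
//   term   = number | "(" , sum , ")" ;
//   number = digit , { digit } ;
class SumParser {
public:
    explicit SumParser(const std::string& input) : input_(input), pos_(0) {}

    // Parses the whole input as a `sum`; on success stores the total in `result`.
    bool parse(long& result) {
        pos_ = 0;
        return parse_sum(result) && pos_ == input_.size();
    }

private:
    // sum = term , { "+" , term } ;
    bool parse_sum(long& value) {
        if (!parse_term(value)) return false;
        while (pos_ < input_.size() && input_[pos_] == '+') {
            ++pos_;
            long rhs = 0;
            if (!parse_term(rhs)) return false;
            value += rhs;
        }
        return true;
    }

    // term = number | "(" , sum , ")" ;
    // The call back into parse_sum is what makes the descent recursive.
    bool parse_term(long& value) {
        if (pos_ < input_.size() && input_[pos_] == '(') {
            ++pos_;
            if (!parse_sum(value)) return false;
            if (pos_ >= input_.size() || input_[pos_] != ')') return false;
            ++pos_;
            return true;
        }
        return parse_number(value);
    }

    // number = digit , { digit } ;
    bool parse_number(long& value) {
        std::size_t start = pos_;
        value = 0;
        while (pos_ < input_.size() &&
               std::isdigit(static_cast<unsigned char>(input_[pos_]))) {
            value = value * 10 + (input_[pos_] - '0');
            ++pos_;
        }
        return pos_ > start;  // at least one digit is required
    }

    std::string input_;
    std::size_t pos_;
};

// Usage sketch: nested parentheses exercise the recursive rule.
inline void usage_example() {
    long total = 0;
    assert(SumParser("1+(2+3)+40").parse(total) && total == 46);
    assert(!SumParser("1+").parse(total));
    assert(!SumParser("(1+2").parse(total));
}
```

The same EBNF could equally be handed to a bottom-up parser generator. The grammar says what is accepted, while the top-down functions above are only one way of accepting it.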